Systems and methods for detecting and tracking movable objects

ABSTRACT

A method for supporting visual tracking includes receiving a plurality of image signals indicative of a plurality of image frames captured by an imaging device over a period of time while the imaging device is in motion. Each image frame includes a plurality of pixels. The method further includes obtaining motion characteristics of the imaging device based on a plurality of motion signals, and analyzing the plurality of image signals based on the motion characteristics of the imaging device, so as to compute movement characteristics associated with the plurality of pixels.

CROSS-REFERENCE

This application is a continuation of application Ser. No. 16/266,773, filed on Feb. 4, 2019, which is a continuation of application Ser. No. 15/366,857, filed on Dec. 1, 2016, now U.S. Pat. No. 10,198,634, which is a continuation of International Application No. PCT/CN2015/089464, filed on Sep. 11, 2015. The above-referenced applications are hereby incorporated by reference in their entireties.

BACKGROUND

In some surveillance, reconnaissance, and exploration tasks for real-world applications, one or more objects may need to be detected and tracked. Conventional tracking methods may be based on global positioning system (GPS) data or camera vision. However, conventional GPS-based or vision-based tracking methods may be inadequate for certain applications. For example, conventional GPS-based tracking methods may not be useful in places with poor GPS signal reception or if the tracked objects do not have GPS receivers located on them. Conventional vision-based tracking methods may lack the capability for precisely tracking a group of moving objects. An aerial vehicle carrying a payload (e.g., a camera) can be used to track objects. In some cases, one or more operators may have to manually select the moving objects to be tracked, and manually control the aerial vehicle/camera to track the moving objects. This limited tracking ability may reduce the usefulness of aerial vehicles in certain applications.

SUMMARY

A need exists to improve conventional tracking methods such as vision-based tracking methods. The improved tracking capabilities may allow an imaging device to automatically detect one or more moving objects and to autonomously track the moving objects, without requiring manual input and/or operation by a user. The improved tracking capabilities may be particularly useful when the imaging device is used to precisely track a fast-moving group of objects, whereby the size and/or shape of the group may be amorphous and change over time as the objects move. The improved tracking capabilities can be incorporated into an aerial vehicle, such as an unmanned aerial vehicle (UAV).

In vision-based tracking methods, a target object may be tracked using an imaging device located on an aerial vehicle. Conventional vision-based tracking methods can be manual or automatic.

For example, in a vision-based manual tracking method, an image may be first captured using the imaging device, and an operator may manually select a target object to be tracked from the image. The manual selection may be performed using an input device, for example, a tablet, a mobile device, or a personal computer (PC). In some instances, the aerial vehicle may be configured to automatically track the target object after it has been manually selected by the operator using the input device. In other instances, the operator may continue to manually control the aerial vehicle to track the target object even after it has been selected.

Conversely, in a vision-based automatic tracking method, automatic tracking may be implemented using tracking algorithms that can automatically detect a particular type of object, or an object carrying a marker. The type of object may be based on different object classes (e.g., people, buildings, landscape, etc.). The marker may include one or more optical markers comprising unique patterns.

In conventional vision-based tracking methods, a target object may be defined based on predetermined features (e.g., color, structure, salient features, etc.) and/or by modeling (e.g., object class). After the target object has been defined, movement of the features and/or model may be detected and calculated in real-time as the target object moves. In these methods, a high level of consistency in the features and/or model may typically be required for precise tracking of the target object. In particular, the level of tracking precision may depend on the spatial relations between the features and/or an error in the model.

Although conventional vision-based tracking methods can be used to track a single object, they may be inadequate for tracking a group of moving objects. In particular, conventional vision-based tracking methods may lack the capability to precisely track a fast-moving group of objects, whereby the size and/or shape of the group may be amorphous and change over time as the objects move. Examples of such groups of objects may include, but are not limited to, groups of moving animals (e.g., a herd of horses running on the plains, or a flock of birds flying in different formations), groups of people (e.g., a large crowd of people moving in a parade), groups of vehicles (e.g., a squadron of airplanes performing aerial acrobatics), or groups comprising different objects moving in different formations (e.g., a group comprising moving animals, people, and vehicles to be tracked).

In a conventional global positioning system (GPS)-based tracking method, an imaging device and a target object may each be provided with GPS apparatus (e.g., a GPS receiver). A spatial relation between the imaging device and the target object may be calculated based on estimates of their real-time locations. The imaging device may be configured to track the target object based on their spatial relation. However, this method may be limited by GPS signal quality and availability of GPS signals. For example, conventional GPS-based tracking methods may not work indoors, or when GPS signal reception is blocked by buildings and/or natural terrain features such as valleys, mountains, etc. Furthermore, these methods are predicated on GPS tracking, and thus cannot be used when the target object(s) (e.g., a group of animals) do not carry GPS apparatus.

In addition, the tracking accuracy in conventional GPS-based tracking methods may be limited, given that the location accuracy of a typical GPS receiver ranges from about 2 meters to about 4 meters. In some instances, an aerial vehicle and a target object may be moving concurrently. However, their estimated positions and velocities from GPS signals may not be updated in real-time at a frequency sufficient for high-precision tracking. For example, there may be a time delay or a lack of correlation between the estimated positions and velocities of the UAV and the target object. This may compound the inherent GPS positioning errors (about 2 m to 4 m) of the UAV and target object, and result in a further decrease in tracking precision/accuracy.

Accordingly, a need exists to improve the tracking capabilities and robustness of an aerial vehicle under different conditions for a variety of applications requiring high accuracy/precision. The conditions may include both indoor and outdoor environments, places without GPS signals or places that have poor GPS signal reception, a variety of different terrain, etc. The applications may include precise tracking of a moving target object and/or a group of moving target objects. The target objects may include target objects that do not carry GPS apparatus, target objects that do not have well-defined features or that do not fall into known object classes, target objects that collectively form a group whereby the size and/or shape of the group may be amorphous and change over time, a plurality of different target objects moving in different formations, or any combination of the above. Systems, methods, and devices are provided herein to address at least the above needs.

For instance, in some aspects of the disclosure, a method for supporting visual tracking is provided. The method may comprise: receiving a plurality of image frames captured at different times using an imaging device, wherein each image frame comprises a plurality of pixels that are associated with a plurality of feature points; analyzing the plurality of image frames to compute movement characteristics of the plurality of feature points; and identifying at least one tracking feature relative to at least one background feature based on the movement characteristics of the plurality of feature points.

According to an aspect of the disclosure, an apparatus for supporting visual tracking is provided. The apparatus may comprise one or more processors that are, individually or collectively, configured to: receive a plurality of image frames captured at different times using an imaging device, wherein each image frame comprises a plurality of pixels that are associated with a plurality of feature points; analyze the plurality of image frames to compute movement characteristics of the plurality of feature points; and identify at least one tracking feature relative to at least one background feature based on the movement characteristics of the plurality of feature points.

According to another aspect of the disclosure, a non-transitory computer-readable medium storing instructions that, when executed, cause a computer to perform a method for supporting visual tracking, is provided. The method may comprise: receiving a plurality of image frames captured at different times using an imaging device, wherein each image frame comprises a plurality of pixels that are associated with a plurality of feature points; analyzing the plurality of image frames to compute movement characteristics of the plurality of feature points; and identifying at least one tracking feature relative to at least one background feature based on the movement characteristics of the plurality of feature points.

A visual tracking system may be provided in accordance with an additional aspect of the disclosure. The system may comprise: an imaging device, and one or more processors that are, individually or collectively, configured to: receive a plurality of image frames captured at different times using the imaging device, wherein each image frame comprises a plurality of pixels that are associated with a plurality of feature points; analyze the plurality of image frames to compute movement characteristics of the plurality of feature points; and identify at least one tracking feature relative to at least one background feature based on the movement characteristics of the plurality of feature points.

Further aspects of the disclosure may be directed to a method for supporting visual tracking. The method may comprise: receiving a plurality of image signals, which are indicative of a plurality of image frames captured by an imaging device over a period of time while the imaging device is in motion, wherein each image frame comprises a plurality of pixels; obtaining motion characteristics of the imaging device based on a plurality of motion signals; and analyzing the plurality of image signals based on the motion characteristics of the imaging device, so as to compute movement characteristics associated with the plurality of pixels.

According to an aspect of the disclosure, an apparatus for supporting visual tracking is provided. The apparatus may comprise one or more processors that are, individually or collectively, configured to: receive a plurality of image signals, which are indicative of a plurality of image frames captured by an imaging device over a period of time while the imaging device is in motion, wherein each image frame comprises a plurality of pixels; obtain motion characteristics of the imaging device based on a plurality of motion signals; and analyze the plurality of image signals based on the motion characteristics of the imaging device, so as to compute movement characteristics associated with the plurality of pixels.

According to another aspect of the disclosure, a non-transitory computer-readable medium storing instructions that, when executed, cause a computer to perform a method for supporting visual tracking, is provided. The method may comprise: receiving a plurality of image signals, which are indicative of a plurality of image frames captured by an imaging device over a period of time while the imaging device is in motion, wherein each image frame comprises a plurality of pixels; obtaining motion characteristics of the imaging device based on a plurality of motion signals; and analyzing the plurality of image signals based on the motion characteristics of the imaging device, so as to compute movement characteristics associated with the plurality of pixels.

An unmanned aerial vehicle (UAV) may be provided in accordance with an additional aspect of the disclosure. The UAV may comprise: a visual tracking system comprising an imaging device, and one or more processors that are, individually or collectively, configured to: receive a plurality of image signals, which are indicative of a plurality of image frames captured by the imaging device over a period of time while the imaging device is in motion, wherein each image frame comprises a plurality of pixels; obtain motion characteristics of the imaging device based on a plurality of motion signals; and analyze the plurality of image signals based on the motion characteristics of the imaging device, so as to compute movement characteristics associated with the plurality of pixels.

Further aspects of the disclosure may be directed to a method for supporting visual tracking. The method may comprise: obtaining, via a mobile visual tracking device, movement characteristics of a plurality of feature points; selecting a group of feature points from the plurality of feature points based on the movement characteristics of the plurality of feature points; and tracking the group of feature points by adjusting motion characteristics of the mobile visual tracking device, so as to substantially position the group of feature points in a target region of each image frame captured using the mobile visual tracking device.

According to an aspect of the disclosure, an apparatus for supporting visual tracking is provided. The apparatus may comprise one or more processors that are, individually or collectively, configured to: obtain, via a mobile visual tracking device, movement characteristics of a plurality of feature points; select a group of feature points from the plurality of feature points based on the movement characteristics of the plurality of feature points; and track the group of feature points by adjusting motion characteristics of the mobile visual tracking device, so as to substantially position the group of feature points in a target region of each image frame captured using the mobile visual tracking device.

According to another aspect of the disclosure, a non-transitory computer-readable medium storing instructions that, when executed, cause a computer to perform a method for supporting visual tracking, is provided. The method may comprise: obtaining, via a mobile visual tracking device, movement characteristics of a plurality of feature points; selecting a group of feature points from the plurality of feature points based on the movement characteristics of the plurality of feature points; and tracking the group of feature points by adjusting motion characteristics of the mobile visual tracking device, so as to substantially position the group of feature points in a target region of each image frame captured using the mobile visual tracking device.

An unmanned aerial vehicle (UAV) may be provided in accordance with an additional aspect of the disclosure. The UAV may comprise: a visual tracking system comprising an imaging device, and one or more processors that are, individually or collectively, configured to: obtain, via a mobile visual tracking device, movement characteristics of a plurality of feature points; select a group of feature points from the plurality of feature points based on the movement characteristics of the plurality of feature points; and track the group of feature points by adjusting motion characteristics of the mobile visual tracking device, so as to substantially position the group of feature points in a target region of each image frame captured using the mobile visual tracking device.

It shall be understood that different aspects of the disclosure can be appreciated individually, collectively, or in combination with each other. Various aspects of the disclosure described herein may be applied to any of the particular applications set forth below or for any other types of movable objects. Any description herein of an aerial vehicle may apply to and be used for any movable object, such as any vehicle. Additionally, the systems, devices, and methods disclosed herein in the context of aerial motion (e.g., flight) may also be applied in the context of other types of motion, such as movement on the ground or on water, underwater motion, or motion in space.

Other objects and features of the present disclosure will become apparent by a review of the specification, claims, and appended figures.

INCORPORATION BY REFERENCE

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosure are utilized, and the accompanying drawings of which:

FIG. 1 illustrates a block diagram of a visual tracking system comprising an exemplary image analyzer, in accordance with some embodiments;

FIG. 2 illustrates the identification of a tracking feature and a background feature in a sequence of exemplary image frames using the image analyzer of FIG. 1, in accordance with some embodiments;

FIG. 3 illustrates different movement characteristics of a pixel in the image frames, in accordance with some embodiments;

FIG. 4 illustrates a sequence of exemplary image frames whereby the size of the contour surrounding a tracking feature increases, in accordance with some embodiments;

FIG. 5 illustrates a sequence of exemplary image frames whereby the size of the contour surrounding a tracking feature decreases, in accordance with some embodiments;

FIG. 6 illustrates a sequence of exemplary image frames whereby the size of the contour surrounding a tracking feature increases, in accordance with some other embodiments;

FIG. 7 illustrates a sequence of exemplary image frames whereby the size of the contour surrounding a tracking feature decreases, in accordance with some other embodiments;

FIG. 8 illustrates a change in size and/or shape of a contour surrounding a tracking feature with the movement, convergence, divergence, addition, and/or subtraction of one or more target objects of different object classes, in accordance with some embodiments;

FIG. 9 illustrates a change in size and/or shape of a contour surrounding a tracking feature as the number of target objects changes, or when the target objects move collectively in a random manner, in accordance with some embodiments;

FIGS. 10, 11, and 12 illustrate the tracking of target objects by an imaging device whereby a size and/or shape of a contour surrounding a tracking feature remains relatively constant as the target objects move from one location to another, in accordance with different embodiments;

FIGS. 13 and 14 illustrate the tracking of target objects by an imaging device whereby a size and/or shape of a contour surrounding a tracking feature changes as the target objects move from one location to another, in accordance with different embodiments;

FIG. 15 illustrates a visual tracking system comprising an image analyzer for computing movement characteristics of a plurality of pixels based on motion characteristics of an imaging device, in accordance with some embodiments;

FIG. 16 illustrates an example of computation of movement characteristics of a plurality of pixels in a sequence of exemplary image frames using the image analyzer of FIG. 15, in accordance with some embodiments;

FIGS. 17, 18, and 19 illustrate different embodiments in which an imaging device is tracking a group of target objects, in accordance with some embodiments;

FIG. 20 illustrates exemplary movements of a background feature and a tracking feature in a sequence of exemplary image frames, in accordance with some embodiments;

FIG. 21 illustrates exemplary movements of a background feature and a tracking feature in a sequence of exemplary image frames, in accordance with some other embodiments;

FIG. 22 illustrates exemplary movements of a background feature and a tracking feature in a sequence of exemplary image frames, in accordance with some further embodiments;

FIG. 23 illustrates an imaging device tracking a target object in a curvilinear manner along an arc, in accordance with some embodiments;

FIG. 24 illustrates exemplary movements of a background feature and a tracking feature in a sequence of exemplary image frames, in accordance with some additional embodiments;

FIG. 25 illustrates exemplary movement of a background feature and a tracking feature in a sequence of exemplary image frames, in accordance with some more additional embodiments;

FIG. 26 illustrates a visual tracking system configured to track a group of feature points by adjusting motion characteristics of a mobile visual tracking device, in accordance with some embodiments;

FIG. 27 illustrates the tracking of a group of feature points in a sequence of exemplary image frames using the mobile visual tracking system of FIG. 26, in accordance with some embodiments;

FIG. 28 illustrates the tracking of a constantly changing group of feature points in a sequence of exemplary image frames using the mobile visual tracking system of FIG. 26, in accordance with some embodiments;

FIG. 29 illustrates the tracking of subsets of feature points using the mobile visual tracking system of FIG. 26, in accordance with some embodiments; and

FIG. 30 is a schematic block diagram of a system for controlling a movable object, in accordance with some embodiments.

DETAILED DESCRIPTION

Systems, methods, and devices provided herein permit a moving object or a group of moving objects to be identified and/or tracked with high precision and/or accuracy. This can improve the identification and/or tracking capabilities of a tracking device. In some instances, the systems, methods, and devices provided herein can identify particular visual features in a plurality of image frames regardless of whether those particular visual features are tracked.

In some embodiments, a plurality of image frames may be captured at different times using an imaging device. Each image frame may comprise a plurality of pixels that are associated with a plurality of feature points. The plurality of image frames may be analyzed to compute movement characteristics of the plurality of feature points. At least one tracking feature relative to at least one background feature may be identified based on the movement characteristics of the plurality of feature points. The tracking feature may be associated with one or more moving objects, and the background feature may be associated with one or more stationary objects. Accordingly, the moving objects and the stationary objects may be identified by distinguishing the tracking feature from the background feature.
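By way of a non-limiting illustration, the distinction between a tracking feature and a background feature may be sketched as follows. The sketch assumes that feature points have already been matched between two image frames and that most feature points belong to stationary objects, so that the median displacement approximates the background movement. The function names, the threshold value, and the sample coordinates are hypothetical and are not taken from the disclosure.

```python
from statistics import median

def displacement(p0, p1):
    """Displacement (dx, dy) of one feature point between two frames."""
    return (p1[0] - p0[0], p1[1] - p0[1])

def split_tracking_and_background(points_t0, points_t1, threshold=2.0):
    """Label each matched feature point as 'tracking' or 'background'.

    Assumption: the median displacement over all points approximates the
    movement of the stationary background; points that deviate from it
    by more than `threshold` pixels are treated as tracking features.
    """
    moves = [displacement(a, b) for a, b in zip(points_t0, points_t1)]
    bg_dx = median(m[0] for m in moves)
    bg_dy = median(m[1] for m in moves)
    labels = []
    for dx, dy in moves:
        residual = ((dx - bg_dx) ** 2 + (dy - bg_dy) ** 2) ** 0.5
        labels.append("tracking" if residual > threshold else "background")
    return labels

# Hypothetical matched feature points in two consecutive image frames.
frame_a = [(10, 10), (50, 12), (90, 40), (30, 70), (60, 60)]
frame_b = [(10, 10), (50, 12), (90, 40), (30, 70), (68, 66)]
labels = split_tracking_and_background(frame_a, frame_b)
print(labels)
```

In this example only the last feature point moves between the frames, so it alone is labeled as a tracking feature.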

In some other embodiments, one or more moving objects can be tracked while the imaging device is in motion. In those embodiments, a plurality of image signals may be received. The image signals may be indicative of a plurality of image frames captured by the imaging device over a period of time while the imaging device is in motion. Each image frame may comprise a plurality of pixels. Motion characteristics of the imaging device may be obtained based on a plurality of motion signals associated with the imaging device. The plurality of image signals may be analyzed based on the motion characteristics of the imaging device, so as to compute movement characteristics associated with the plurality of pixels. In some instances, a correlation between the movement characteristics associated with the plurality of pixels and the motion characteristics of the imaging device may be obtained.
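One possible way of using the motion characteristics of the imaging device when analyzing pixel movement is sketched below. The sketch assumes, purely for illustration, that the motion signals have already been converted into a single expected image shift (in pixels per frame) that the motion of the imaging device alone would induce; subtracting that shift from the observed pixel movement leaves the movement attributable to the objects themselves. The names `residual_movement` and `camera_induced_shift`, and the sample values, are hypothetical.

```python
def residual_movement(observed_flow, camera_induced_shift):
    """Remove the image shift caused by the imaging device's own motion.

    observed_flow: list of (dx, dy) pixel movements between two frames.
    camera_induced_shift: (dx, dy) shift expected from device motion alone
    (assumed uniform over the frame in this simplified sketch).
    """
    cx, cy = camera_induced_shift
    return [(dx - cx, dy - cy) for dx, dy in observed_flow]

def speed(vector):
    """Magnitude of a 2D movement vector."""
    return (vector[0] ** 2 + vector[1] ** 2) ** 0.5

# Hypothetical values: the device moves so that a static scene appears
# to shift 5 pixels to the left in each frame.
observed = [(-5.0, 0.0), (-5.0, 0.0), (-1.0, 3.0)]
camera_shift = (-5.0, 0.0)

residual = residual_movement(observed, camera_shift)
moving = [i for i, v in enumerate(residual) if speed(v) > 1.0]
print(residual, moving)
```

Here the first two pixels move exactly as the device motion predicts and are therefore treated as stationary, while the third retains a residual movement and is flagged as moving.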

In some further embodiments, one or more moving objects can be tracked by adjusting motion characteristics of a mobile visual tracking device. In those embodiments, movement characteristics of a plurality of feature points may be obtained via a mobile visual tracking device. A group of feature points from the plurality of feature points may be selected based on the movement characteristics of the plurality of feature points. The group of feature points may be associated with the one or more moving objects. The group of feature points may be tracked by adjusting motion characteristics of the mobile visual tracking device, so as to substantially position the group of feature points in a target region of each image frame captured using the mobile visual tracking device.
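The positioning of a group of feature points in a target region may be illustrated with a simple proportional adjustment. The following is a minimal sketch that assumes the target region is represented by its center in pixel coordinates and that the adjustment of the mobile visual tracking device can be expressed as a two-component command proportional to the pixel error. The gain, the function names, and the sample coordinates are hypothetical; an actual device may use a different control law.

```python
def centroid(points):
    """Mean position of a group of feature points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def tracking_command(group_points, target_center, gain=0.01):
    """Proportional command that drives the group centroid toward the
    center of the target region (assumed control law, for illustration)."""
    cx, cy = centroid(group_points)
    err_x = cx - target_center[0]
    err_y = cy - target_center[1]
    return (gain * err_x, gain * err_y)

# Hypothetical group of feature points in a 1280x720 image frame whose
# target region is centered at (640, 360).
group = [(700, 300), (740, 340), (720, 380)]
command = tracking_command(group, target_center=(640, 360))
print(centroid(group), command)
```

The group centroid lies to the right of and above the target center, so the command has a positive horizontal component and a negative vertical component; when the centroid coincides with the target center the command is zero.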

Accordingly, one or more moving objects can be detected and precisely tracked using the systems, methods, and devices provided herein. The moving objects may include moving objects that do not carry GPS apparatus, moving objects that do not have well-defined features or that do not fall into known object classes, moving objects that cannot be easily detected using conventional object recognition methods, moving objects that collectively form a group whereby the size and/or shape of the group may be amorphous and change over time, a plurality of different objects moving in different formations, or any combination(s) of the above.

It shall be understood that different aspects of the disclosure can be appreciated individually, collectively, or in combination with each other. Various aspects of the disclosure described herein may be applied to any of the particular applications set forth below or for any other types of remotely controlled vehicles or movable objects.

The present disclosure provides embodiments of systems, devices, and/or methods for improving the tracking capabilities of an imaging device, e.g., supported by an unmanned aerial vehicle (UAV), and that enable autonomous tracking of a group of moving objects. Description of the UAV may apply to any type of vehicle, such as land-bound, underground, underwater, water surface, aerial, or space-based vehicles.

FIG. 1 illustrates a block diagram of a visual tracking system 100 comprising an exemplary image analyzer, in accordance with some embodiments. The visual tracking system may be implemented as a stand-alone system, and need not be provided on a vehicle. In some other embodiments, the visual tracking system may be provided on a vehicle. As shown in FIG. 1, the visual tracking system may include an imaging device 110 and an image analyzer 120. The visual tracking system may be configured to identify at least one tracking feature relative to at least one background feature, based on movement characteristics of a plurality of feature points.

An imaging device as used herein may serve as an image capture device. An imaging device may be a physical imaging device. An imaging device can be configured to detect electromagnetic radiation (e.g., visible, infrared, and/or ultraviolet light) and generate image data based on the detected electromagnetic radiation. An imaging device may include a charge-coupled device (CCD) sensor or a complementary metal-oxide-semiconductor (CMOS) sensor that generates electrical signals in response to wavelengths of light. The resultant electrical signals can be processed to produce image data. The image data generated by an imaging device can include one or more images, which may be static images (e.g., photographs), dynamic images (e.g., video), or suitable combinations thereof. The image data can be polychromatic (e.g., RGB, CMYK, HSV) or monochromatic (e.g., grayscale, black-and-white, sepia). The imaging device may include a lens configured to direct light onto an image sensor.

In some embodiments, the imaging device can be a camera. A camera can be a movie or video camera that captures dynamic image data (e.g., video). A camera can be a still camera that captures static images (e.g., photographs). A camera may capture both dynamic image data and static images. A camera may switch between capturing dynamic image data and static images. Although certain embodiments provided herein are described in the context of cameras, it shall be understood that the present disclosure can be applied to any suitable imaging device, and any description herein relating to cameras can also be applied to other types of imaging devices. A camera can be used to generate 2D images of a 3D scene (e.g., an environment, one or more objects, etc.). The images generated by the camera can represent the projection of the 3D scene onto a 2D image plane. Accordingly, each point in the 2D image corresponds to a 3D spatial coordinate in the scene. The camera may comprise optical elements (e.g., lenses, mirrors, filters, etc.). The camera may capture color images, grayscale images, infrared images, and the like. The camera may be a thermal imaging device when it is configured to capture infrared images.
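The projection of a 3D scene onto a 2D image plane may be illustrated with an idealized pinhole camera model. The disclosure does not specify a particular camera model; the pinhole model, the focal length expressed in pixels, the principal point, and the sample coordinates below are assumptions made only for this sketch.

```python
def project_point(point_3d, focal_length_px, principal_point):
    """Project a 3D point in camera coordinates onto the 2D image plane
    using an idealized pinhole model (no lens distortion assumed)."""
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    u = focal_length_px * x / z + principal_point[0]
    v = focal_length_px * y / z + principal_point[1]
    return (u, v)

# Hypothetical camera: 1000-pixel focal length, 1920x1080 image with the
# principal point at the image center.
pixel = project_point((1.0, 0.5, 10.0),
                      focal_length_px=1000.0,
                      principal_point=(960.0, 540.0))
print(pixel)
```

Under these assumptions, a point 10 units in front of the camera and offset by (1.0, 0.5) units is imaged 100 pixels to the right of and 50 pixels below the image center.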

The imaging device may capture an image or a sequence of images at a specific image resolution. In some embodiments, the image resolution may be defined by the number of pixels in an image. In some embodiments, the image resolution may be greater than or equal to about 352×420 pixels, 480×320 pixels, 720×480 pixels, 1280×720 pixels, 1440×1080 pixels, 1920×1080 pixels, 2048×1080 pixels, 3840×2160 pixels, 4096×2160 pixels, 7680×4320 pixels, or 15360×8640 pixels. In some embodiments, the camera may be a 4K camera or a camera with a higher resolution.

The imaging device may capture a sequence of images at a specific capture rate. In some embodiments, the sequence of images may be captured at standard video frame rates such as about 24p, 25p, 30p, 48p, 50p, 60p, 72p, 90p, 100p, 120p, 300p, 50i, or 60i. In some embodiments, the sequence of images may be captured at a rate less than or equal to about one image every 0.0001 seconds, 0.0002 seconds, 0.0005 seconds, 0.001 seconds, 0.002 seconds, 0.005 seconds, 0.01 seconds, 0.02 seconds, 0.05 seconds, 0.1 seconds, 0.2 seconds, 0.5 seconds, 1 second, 2 seconds, 5 seconds, or 10 seconds. In some embodiments, the capture rate may change depending on user input and/or external conditions (e.g., rain, snow, wind, unobvious surface texture of environment).

The imaging device may have adjustable parameters. Under differing parameters, different images may be captured by the imaging device while subject to identical external conditions (e.g., location, lighting). The adjustable parameter may comprise exposure (e.g., exposure time, shutter speed, aperture, film speed), gain, gamma, area of interest, binning/subsampling, pixel clock, offset, triggering, ISO, etc. Parameters related to exposure may control the amount of light that reaches an image sensor in the imaging device. For example, shutter speed may control the amount of time light reaches an image sensor and aperture may control the amount of light that reaches the image sensor in a given time. Parameters related to gain may control the amplification of a signal from the optical sensor. ISO may control the level of sensitivity of the camera to available light. Parameters controlling for exposure and gain may be collectively considered and be referred to herein as EXPO.
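The combined effect of shutter speed and aperture on the light reaching the image sensor may be illustrated with the conventional photographic exposure value, EV = log2(N^2 / t), where N is the f-number and t is the exposure time in seconds. This formula is a general photographic convention and is not part of the disclosure; the function name and the sample settings are hypothetical.

```python
import math

def exposure_value(f_number, exposure_time_s):
    """Conventional exposure value EV = log2(N^2 / t).

    A higher EV corresponds to less light reaching the image sensor.
    """
    return math.log2(f_number ** 2 / exposure_time_s)

# Hypothetical settings: same aperture, exposure time doubled.
ev_fast = exposure_value(4.0, 1 / 60)
ev_slow = exposure_value(4.0, 1 / 30)
# Same exposure time, wider aperture (smaller f-number).
ev_wide = exposure_value(2.8, 1 / 60)
print(ev_fast - ev_slow, ev_wide < ev_fast)
```

Doubling the exposure time lowers the exposure value by one step, and widening the aperture at a fixed exposure time also lowers it, which is consistent with more light reaching the image sensor in both cases.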

In some alternative embodiments, an imaging device may extend beyond a physical imaging device. For example, an imaging device may include any technique that is capable of capturing and/or generating images or video frames. In some embodiments, the imaging device may refer to an algorithm that is capable of processing images obtained from another physical device.

In the example of FIG. 1, the imaging device may be configured to capture image data of a plurality of objects 102. The image data may correspond to, for example, still images or video frames of the plurality of objects. The objects may include any physical object or structure that can be optically identified and/or tracked in real-time by the visual tracking system. Optical tracking has several advantages. For example, optical tracking allows for wireless ‘sensors’, is less susceptible to noise, and allows for many objects (e.g., different types of objects) to be tracked simultaneously. The objects can be depicted in still images and/or video frames in a 2D or 3D format, can be real-life and/or animated, can be in color, black/white, or grayscale, and can be in any color space.

As shown in FIG. 1, a visual path (denoted by a dotted line) is provided between the imaging device and the plurality of objects, such that the objects lie in the field-of-view of the imaging device. In some embodiments, the objects may be operatively connected to one or more of the components in FIG. 1. For example, the objects may be in communication with one or more of the components in system 100. In some embodiments, the objects may include GPS apparatus (e.g., a GPS receiver) disposed thereon.

In some other embodiments, the objects need not be operatively connected to any of the components in FIG. 1. For example, the objects need not be in communication with any of the components in system 100. The objects also need not include any GPS apparatus (e.g., a GPS receiver) disposed thereon. Instead, the objects can be any stand-alone physical object or structure. Some of the objects may be capable of motion (e.g., translation and/or rotation, land-bound travel, aerial flight, etc.). Any type, range, and magnitude of motion of some or all of the objects may be contemplated, as described below.

The objects may be generally classified into target objects and background objects. Target objects as used herein refer to objects that are capable of motion, and may be moving or stationary at any given point in time. In some instances, when the target objects are moving, the target objects may be referred to as moving objects. Examples of target objects may include a living subject, such as a human or an animal, or a group of humans or a group of animals. Alternatively, the target object may be carried by a living subject, such as a human or an animal, or a movable object such as a vehicle. Background objects as used herein generally refer to objects that are substantially affixed at a location. Background objects may be incapable of motion, such as stationary objects. Examples of background objects may include geographic features, plants, landmarks, buildings, monolithic structures, or any fixed structures.

The target object may also be any object configured to move within any suitable environment, such as in air (e.g., a fixed-wing aircraft, a rotary-wing aircraft, or an aircraft having neither fixed wings nor rotary wings), in water (e.g., a ship or a submarine), on ground (e.g., a motor vehicle, such as a car, truck, bus, van, or motorcycle; a movable structure or frame such as a stick or fishing pole; or a train), under the ground (e.g., a subway), in space (e.g., a spaceplane, a satellite, or a probe), or any combination of these environments.

The target object may be capable of moving freely within the environment with respect to six degrees of freedom (e.g., three degrees of freedom in translation and three degrees of freedom in rotation). Alternatively, the movement of the target object can be constrained with respect to one or more degrees of freedom, such as by a predetermined path, track, or orientation. The movement can be actuated by any suitable actuation mechanism, such as an engine or a motor. The actuation mechanism of the target object can be powered by any suitable energy source, such as electrical energy, magnetic energy, solar energy, wind energy, gravitational energy, chemical energy, nuclear energy, or any suitable combination thereof. The target object may be self-propelled via a propulsion system, such as described further below. The propulsion system may optionally run on an energy source, such as electrical energy, magnetic energy, solar energy, wind energy, gravitational energy, chemical energy, nuclear energy, or any suitable combination thereof.

In some instances, the target object can be a vehicle, such as a remotely controlled vehicle. Suitable vehicles may include water vehicles, aerial vehicles, space vehicles, or ground vehicles. For example, aerial vehicles may be fixed-wing aircraft (e.g., airplanes, gliders), rotary-wing aircraft (e.g., helicopters, rotorcraft), aircraft having both fixed wings and rotary wings, or aircraft having neither (e.g., blimps, hot air balloons). A vehicle can be self-propelled, such as self-propelled through the air, on or in water, in space, or on or under the ground. A self-propelled vehicle can utilize a propulsion system, such as a propulsion system including one or more engines, motors, wheels, axles, magnets, rotors, propellers, blades, nozzles, or any suitable combination thereof. In some instances, the propulsion system can be used to enable the movable object to take off from a surface, land on a surface, maintain its current position and/or orientation (e.g., hover), change orientation, and/or change position.

In some embodiments, the target object may be tracked by a tracking device. The tracking device may be an imaging device, or a movable object carrying an imaging device. The movable object may be, for example, a UAV. The target object may be the same type of movable object as the tracking device, or may be a different type of movable object from the tracking device. For instance, in some embodiments, both the tracking device and the target object may be UAVs. The tracking device and the target object may be the same type of UAV or different types of UAVs. Different types of UAVs may have different shapes, form factors, functionality, or other characteristics. The target object and the tracking device may move in 3-dimensional space relative to the background object. As previously described, examples of background objects may include geographic features (e.g., mountains), landmarks (e.g., bridges), buildings (e.g., skyscrapers, stadiums, etc.), or any fixed structures.

As shown in FIG. 1, the image data captured by the imaging device may be encoded in a plurality of image signals 112. The plurality of image signals may be generated using the imaging device. The image signals may comprise a plurality of image frames captured at different times using the imaging device. For example, the image signals may comprise a first image frame 112-1 captured at time T1 and a second image frame 112-2 captured at time T2, whereby time T2 may be a point in time occurring after time T1. Each image frame may comprise a plurality of pixels. In some embodiments, the plurality of image frames may comprise a plurality of color images, and the plurality of pixels may comprise color pixels. In other embodiments, the plurality of image frames may comprise a plurality of grayscale images, and the plurality of pixels may comprise grayscale pixels. In some embodiments, each pixel in the plurality of grayscale images may have a normalized grayscale value.

The plurality of pixels in the image frames may be associated with a plurality of feature points. A feature point may correspond to a point or an area on an object. In some embodiments, a feature point may be represented by a single pixel in an image frame. For example, each feature point may have a 1:1 correspondence (or 1:1 correlation) with a corresponding pixel. In some embodiments, each feature point may directly correlate with a grayscale value of the corresponding pixel. In some embodiments, a feature point may be represented by a cluster of pixels in an image frame. For example, each feature point may have a 1:n correspondence (or 1:n correlation) with n pixels, where n is any integer greater than 1. The cluster of pixels may include 2, 3, 4, 5, 6, 7, 8, 9, 10, or more pixels. All pixels can be individually analyzed, either simultaneously or sequentially. Likewise, all clusters of pixels can be individually analyzed, either simultaneously or sequentially. Analysis of clusters of pixels can help to reduce the processing time (as well as processing power) required to analyze all pixels in an image frame. Movement characteristics of the one or more pixel(s) may be analyzed to determine one or more feature points associated with those pixel(s), as described later in the specification.
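By way of illustration only, the 1:n correspondence between a feature point and a cluster of pixels may be sketched as follows. The sketch assumes that a grayscale image frame is held as a list of rows of normalized values and that each cluster is a non-overlapping k×k block whose mean value stands in for the feature point. The function name pool_clusters, the square block shape, and the use of a mean are assumptions made for the example and are not taken from the specification.

```python
def pool_clusters(frame, k):
    """Average non-overlapping k x k pixel clusters of a grayscale frame.

    Hypothetical helper: each returned value represents one cluster of
    k*k pixels (a 1:n correspondence with n = k*k). Rows and columns that
    do not fill a whole cluster are ignored.
    """
    rows, cols = len(frame), len(frame[0])
    pooled = []
    for r in range(0, rows - rows % k, k):
        out_row = []
        for c in range(0, cols - cols % k, k):
            block = [frame[r + i][c + j] for i in range(k) for j in range(k)]
            out_row.append(sum(block) / len(block))
        pooled.append(out_row)
    return pooled


# A 4x4 frame of normalized grayscale values, reduced to 2x2 clusters.
frame = [
    [0.0, 0.2, 1.0, 1.0],
    [0.2, 0.4, 1.0, 1.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
]
pooled = pool_clusters(frame, 2)
```

In this sketch, sixteen pixels are reduced to four cluster values, so any later per-point analysis operates on a quarter of the data.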

In some particular embodiments, a feature point can be a portion of an image (e.g., an edge, corner, interest point, blob, ridge, etc.) that is uniquely distinguishable from the remaining portions of the image and/or other feature points in the image. Optionally, a feature point may be relatively invariant to transformations of the imaged object (e.g., translation, rotation, scaling) and/or changes in the characteristics of the image (e.g., brightness, exposure). A feature point may be detected in portions of an image that are rich in terms of informational content (e.g., significant 2D texture). A feature point may be detected in portions of an image that are stable under perturbations (e.g., when varying illumination and brightness of an image).

Feature points can be detected using various algorithms (e.g., a texture detection algorithm) which may extract one or more feature points from image data. The algorithms may additionally make various calculations regarding the feature points. For example, the algorithms may calculate a total number of feature points, or “feature point number.” The algorithms may also calculate a distribution of feature points. For example, the feature points may be widely distributed within an image (e.g., image data) or a subsection of the image. Alternatively, the feature points may be narrowly distributed within an image (e.g., image data) or a subsection of the image. The algorithms may also calculate a quality of the feature points. In some instances, the quality of feature points may be determined or evaluated based on a value calculated by algorithms mentioned herein (e.g., FAST, Corner detector, Harris, etc.).

The algorithm may be an edge detection algorithm, a corner detection algorithm, a blob detection algorithm, or a ridge detection algorithm. In some embodiments, the corner detection algorithm may be “Features from accelerated segment test” (FAST). In some embodiments, the feature detector may extract feature points and make calculations regarding feature points using FAST. In some embodiments, the feature detector can be a Canny edge detector, Sobel operator, Harris & Stephens/Plessey/Shi-Tomasi corner detection algorithm, the SUSAN corner detector, Level curve curvature approach, Laplacian of Gaussian, Difference of Gaussians, Determinant of Hessian, MSER, PCBR, Grey-level blobs, ORB, FREAK, or suitable combinations thereof.

In some embodiments, a feature point may comprise one or more non-salient features. As used herein, non-salient features may refer to non-salient regions or non-distinct (e.g., non-recognizable) objects within an image. Non-salient features may refer to elements within an image that are unlikely to stand out or catch attention of a human observer. Examples of non-salient features may include individual pixels or groups of pixels that are non-distinct or non-identifiable to a viewer, when viewed outside of the context of their surrounding pixels.

In some alternative embodiments, a feature point may comprise one or more salient features. As used herein, salient features may refer to salient regions or distinct (e.g., recognizable) objects within an image. Salient features may refer to elements within an image that are likely to stand out or catch attention of a human observer. A salient feature may have semantic meaning. Salient features may refer to elements that may be identified consistently under computer vision processes. A salient feature may refer to animate objects, inanimate objects, landmarks, marks, logos, obstacles, and the like within an image. A salient feature may be persistently observed under differing conditions. For example, a salient feature may be persistently identified (e.g., by a human observer or by computer programs) in images acquired from different points of view, during different times of the day, under different lighting conditions, under different weather conditions, under different image acquisition settings (e.g., different gain, exposure, etc.), and the like. For example, salient features may include humans, animals, faces, bodies, structures, buildings, vehicles, planes, signs, and the like.

Salient features may be identified or determined using any existing saliency calculating methods. For example, salient features may be identified by contrast based filtering (e.g., color, intensity, orientation, size, motion, depth based, etc.), using a spectral residual approach, via frequency-tuned salient region detection, via binarized normed gradients for objectness estimation, using a context-aware top down approach, by measuring visual saliency by site entropy rate, and the like. For example, salient features may be identified in a saliency map that is generated by subjecting one or more images to contrast based filtering (e.g., color, intensity, orientation, etc.). A saliency map may represent areas with feature contrasts. A saliency map may be a predictor of where people will look. A saliency map may comprise a spatial heat map representation of features or fixations. For example, in a saliency map, salient regions may have a higher luminance contrast, color contrast, edge content, intensities, etc. than non-salient regions. In some embodiments, salient features may be identified using object recognition algorithms (e.g., feature based methods, appearance based methods, etc.). Optionally, one or more objects or types of patterns, objects, figures, colors, logos, outlines, etc. may be pre-stored as possible salient features. An image may be analyzed to identify salient features that are pre-stored (e.g., an object or types of objects). The pre-stored salient features may be updated. Alternatively, salient features may not need to be pre-stored. Salient features may be recognized on a real time basis independent of pre-stored information.
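By way of illustration only, a saliency map based on intensity contrast may be sketched as follows. The sketch assumes a grayscale image frame held as a list of rows and scores each pixel by the absolute difference between its value and the mean of its surrounding neighborhood. This simple center-surround measure, the function name intensity_saliency, and the radius parameter are assumptions made for the example; the sketch is not one of the named saliency methods and covers the intensity channel only.

```python
def intensity_saliency(frame, radius=1):
    """Return a map of |pixel - mean of surrounding pixels| per pixel.

    Hypothetical center-surround contrast measure: larger values mark
    pixels that differ more strongly from their immediate neighborhood.
    """
    rows, cols = len(frame), len(frame[0])
    saliency = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                frame[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
                if (rr, cc) != (r, c)
            ]
            if not neighbours:  # a 1x1 frame has no surround
                continue
            surround = sum(neighbours) / len(neighbours)
            saliency[r][c] = abs(frame[r][c] - surround)
    return saliency


# A dark 5x5 frame with a single bright pixel in the middle.
frame = [[0.0] * 5 for _ in range(5)]
frame[2][2] = 1.0
saliency = intensity_saliency(frame)
```

In this sketch the bright pixel receives the highest score, its immediate neighbors receive a small score, and uniform regions score zero, which corresponds to the heat map representation described above.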

In some embodiments, the imaging device may be mounted or co-located on a tracking device (not shown). The tracking device can be, for example, a vehicle that is capable of traveling in the air, on land, on water, or within a water body. Examples of vehicles may include an aerial vehicle (e.g., a UAV), a land-bound vehicle (e.g., a car), a water-bound vehicle (e.g., a boat), etc. In some embodiments, the tracking device may be a mobile device, a cell phone or smartphone, a personal digital assistant (PDA), a computer, a laptop, a tablet PC, a media content player, a video game station/system, wearable devices such as a virtual reality headset or a head mounted device (HMD), or any electronic device capable of capturing, providing or rendering image data, and/or identifying or tracking a target object based on the image data. The tracking device may further include software applications that allow the tracking device to communicate with and receive image data from the imaging device. The tracking device may be configured to provide the image data to the image analyzer for image analysis. In some instances, the tracking device may be self-propelled, can be stationary or moving, and may change orientation (e.g., attitude) over time.

As another example, the tracking device can be a web server, an enterprise server, or any other type of computer server. The tracking device can be a computer programmed to accept requests (e.g., HTTP, or other protocols that can initiate data transmission) from the image analyzer and to serve the image analyzer with requested image data. In some embodiments, the tracking device can be a broadcasting facility, such as free-to-air, cable, satellite, and other broadcasting facility, for distributing image data.

In some embodiments, the image data captured by the imaging device may be stored in a media storage (not shown) before the image data is provided to the image analyzer. The image analyzer may be configured to receive the image data directly from the media storage. In some embodiments, the image analyzer may be configured to receive image data concurrently from both the imaging device and the media storage. The media storage can be any type of storage medium capable of storing image data of a plurality of objects. As previously described, the image data may include video or still images. The video or still images may be processed and analyzed by the image analyzer, as described later in the specification. The media storage can be provided as a CD, DVD, Blu-ray disc, hard disk, magnetic tape, flash memory card/drive, solid state drive, volatile or non-volatile memory, holographic data storage, and any other type of storage medium. In some embodiments, the media storage can also be a computer capable of providing image data to the image analyzer.

As another example, the media storage can be a web server, an enterprise server, or any other type of computer server. The media storage can be a computer programmed to accept requests (e.g., HTTP, or other protocols that can initiate data transmission) from the image analyzer and to serve the image analyzer with requested image data. In addition, the media storage can be a broadcasting facility, such as free-to-air, cable, satellite, and other broadcasting facility, for distributing image data. The media storage may also be a server in a data network (e.g., a cloud computing network).

In some embodiments, the media storage may be located on-board the imaging device. In some other embodiments, the media storage may be located on-board the tracking device but off-board the imaging device. In some further embodiments, the media storage may be located on one or more external devices off-board the tracking device and/or the imaging device. In those further embodiments, the media storage may be located on a remote controller, a ground station, a server, etc. Any arrangement or combination of the above components may be contemplated. In some embodiments, the media storage may communicate with the imaging device and the tracking device via a peer-to-peer network architecture. In some embodiments, the media storage may be implemented using a cloud computing architecture.

The image data may be provided (in the form of image signals 112) to the image analyzer for image processing/analysis. In the example of FIG. 1, the image analyzer can be implemented as a software program executing in a processor and/or as hardware that analyzes the plurality of image frames to identify at least one tracking feature relative to at least one background feature from the plurality of feature points. For example, the image analyzer may be configured to analyze the image frames to compute movement characteristics of the plurality of feature points, and to identify at least one tracking feature relative to at least one background feature based on the movement characteristics of the plurality of feature points. The tracking feature may be associated with one or more target objects. The background feature may be associated with one or more background objects.

The image analyzer may be configured to determine the relative positions between the target object and the background object based on the movement characteristics of the plurality of feature points. The imaging device may be stationary or mobile. The background object is typically stationary. The target object may be stationary or mobile. In some embodiments, the tracking feature and background feature may be identified while at least one of the imaging device or the target object is in motion or is capable of motion. At any given moment in time, the imaging device or the target object may be capable of moving and/or stopping. For instance, a UAV supporting the imaging device may hover for a period of time before moving to another location.

In some embodiments, the image analyzer may be located remotely from the imaging device. For example, the image analyzer may be disposed in a remote server that is in communication with the imaging device. The image analyzer may be provided at any other type of external device (e.g., a remote controller for a tracking device, an object carried by the target object, a reference location such as a base station, or another tracking device), or may be distributed on a cloud computing infrastructure. In some embodiments, the image analyzer and the media storage may be located on a same device. In other embodiments, the image analyzer and the media storage may be located on different devices. The image analyzer and the media storage may communicate either via wired or wireless connections. In some embodiments, the image analyzer may be located on a tracking device. For example, the image analyzer may be disposed in a housing of the tracking device. In some other embodiments, the image analyzer may be located on the target object. For example, the image analyzer may be disposed on a body of the target object. In some other embodiments, the image analyzer may be located on the background object. For example, the image analyzer may be disposed on a body of the background object. In some further embodiments, the image analyzer may be disposed at a base station that is in communication with the tracking device and/or the target object. The image analyzer may be located anywhere, as long as the image analyzer is capable of: (i) receiving a plurality of image frames captured at different times using an imaging device, (ii) analyzing the plurality of image frames to compute movement characteristics of the plurality of feature points, and (iii) identifying at least one tracking feature relative to at least one background feature based on the movement characteristics of the plurality of feature points. The image analyzer may communicate with one or more of the aforementioned tracking device, target object, background object, base station, or any other device to receive image data from which movement characteristics of a plurality of feature points can be computed, and from which a tracking feature relative to a background feature can be identified.

In some embodiments, the resulting analysis of the image frames may be provided (in the form of analyzed signals 122) to an output device (not shown). For example, the identified tracking feature and background feature may be depicted in one or more resulting image frames that are displayed on the output device. The resulting image frames may be encoded in the analyzed signals 122. The resulting image frames may include annotations (e.g., labels, circled regions, different color coding, etc.) distinguishing the tracking feature from the background feature. The output device can be a display device such as, for example, a display panel, monitor, television, projector, or any other display device. In some embodiments, the output device can be, for example, a cell phone or smartphone, personal digital assistant (PDA), computer, laptop, desktop, a tablet PC, media content player, set-top box, television set including a broadcast tuner, video game station/system, or any electronic device capable of accessing a data network and/or receiving analyzed image data from the image analyzer.

In some embodiments, the components 110 and 120 may be located on separate discrete devices. In those embodiments, the devices (on which components 110 and 120 are respectively located) may be operatively connected to each other via a network or any type of communication links that allow transmission of data from one component to another. The network may include the Internet, Local Area Networks (LANs), Wide Area Networks (WANs), Bluetooth, Near Field Communication (NFC) technologies, networks based on mobile data protocols such as General Packet Radio Services (GPRS), GSM, Enhanced Data GSM Environment (EDGE), 3G, 4G, or Long Term Evolution (LTE) protocols, Infra-Red (IR) communication technologies, and/or Wi-Fi, and may be wireless, wired, or a combination thereof.

While shown in FIG. 1 as separate components that are operatively connected, it is noted that the imaging device and the image analyzer may be co-located in one device. For example, the image analyzer can be located within or form part of the imaging device. Conversely, the imaging device can be located within or form part of the image analyzer. In some embodiments, at least one of the imaging device or the image analyzer may be co-located on a user device. In some embodiments, a media storage may be located within or form part of the imaging device. In some embodiments, at least one of the imaging device or the image analyzer can be located within or form part of a mobile visual tracking device. The mobile visual tracking device may be mounted on (or enabled using) an aerial vehicle, for example a UAV. It is understood that the configuration shown in FIG. 1 is for illustrative purposes only. Certain components or devices may be removed or combined, and other components or devices may be added.

As previously described, the image analyzer may be configured to analyze the plurality of image frames to compute movement characteristics of the plurality of feature points, and to identify at least one tracking feature relative to at least one background feature based on the movement characteristics of the plurality of feature points. In some implementations, the feature points may each correspond to a single pixel or a group of pixels. Any description of analysis based on feature points may also apply to analysis based on individual pixels or groups of pixels. This may occur without regard to any property of the pixel(s) (e.g., brightness, color, contrast, etc.). Alternatively, one or more of such properties of the pixel(s) may be taken into account. The aforementioned steps can be implemented using an optical flow algorithm, and will be described in further detail with reference to FIG. 2. The optical flow algorithm may be performed using the image analyzer. The optical flow algorithm can be used to compute the motion of pixels or feature points of an image sequence, and can provide a dense (point-to-point) pixel or feature point correspondence.
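By way of illustration only, the motion of a group of pixels between two image frames may be estimated as sketched below. The specification does not name a particular optical flow algorithm, so the sketch substitutes simple block matching: a block from the earlier frame is compared, by sum of absolute differences, against shifted positions in the later frame, and the best-matching shift is reported. The function name block_flow, its parameters, and the synthetic frames are assumptions made for the example.

```python
def block_flow(prev, curr, top, left, size, search):
    """Estimate the (dx, dy) pixel shift of one block between two frames.

    Hypothetical block-matching stand-in for an optical flow step: the
    size x size block at (top, left) in `prev` is compared with every
    position in `curr` within +/- `search` pixels, and the shift with the
    lowest sum of absolute differences is returned.
    """
    rows, cols = len(curr), len(curr[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r0, c0 = top + dy, left + dx
            # Skip candidate positions that fall outside the later frame.
            if r0 < 0 or c0 < 0 or r0 + size > rows or c0 + size > cols:
                continue
            cost = sum(
                abs(prev[top + i][left + j] - curr[r0 + i][c0 + j])
                for i in range(size)
                for j in range(size)
            )
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def make_frame(patch_left):
    """Build a dark 6x6 frame with a bright 2x2 patch on rows 1 and 2."""
    frame = [[0.0] * 6 for _ in range(6)]
    for r in (1, 2):
        for c in (patch_left, patch_left + 1):
            frame[r][c] = 1.0
    return frame


prev_frame = make_frame(1)  # patch occupies columns 1 and 2
curr_frame = make_frame(3)  # patch occupies columns 3 and 4
flow = block_flow(prev_frame, curr_frame, top=1, left=1, size=2, search=2)
```

Repeating the estimate for every block (or every pixel neighborhood) of a frame would yield one motion vector per location, which is the kind of dense correspondence described above. Textureless blocks match equally well at several shifts, so this sketch returns the first such shift it encounters.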

FIG. 2 illustrates the identification of tracking features and background features in exemplary images using the image analyzer of FIG. 1, in accordance with some embodiments. Referring to FIG. 2, an image analyzer 204 may receive a plurality of image signals 212 from an imaging device (e.g., imaging device 110 of FIG. 1). The image signals 212 may comprise a first image frame 212-1 captured at time T1 and a second image frame 212-2 captured at time T2, whereby time T2 may be a point in time occurring after time T1. Although FIG. 2 depicts two image frames, any number of image frames may be contemplated. For example, in some embodiments, the image signals 212 may comprise a plurality of image frames 212-1 to 212-n captured over a period of time starting from T1 to Tn, where n may be any integer greater than 1.

In some embodiments, more than one image frame may be captured at a particular time instance. For example, the image signals 212 may comprise a plurality of image frames 212-1 captured at time T1, a plurality of image frames 212-2 captured at time T2, etc. The plurality of image frames at each time instance may be averaged and transformed into a single image frame associated with that particular time instance. In some embodiments, 1, 2, 3, 4, 5, or more image frames may be captured every second. In some embodiments, an image frame may be captured every 2 seconds, 3 seconds, 4 seconds, 5 seconds, or more than 5 seconds. The image frames may be captured at a fixed frequency or at different frequencies. For example, a greater number of image frames may be captured when the target object is moving quickly, and a smaller number of image frames may be captured when the target object is moving slowly. In some embodiments, the image analyzer may be configured to analyze only those image frames that have different pixel (or feature point) movement characteristics between the image frames.
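By way of illustration only, the averaging of several image frames captured at one time instance into a single image frame may be sketched as follows. The sketch assumes that the frames are grayscale, equal in size, and held as lists of rows, and that a per-pixel arithmetic mean is used. The function name average_frames is hypothetical.

```python
def average_frames(frames):
    """Combine equally sized grayscale frames into one per-pixel mean frame."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [sum(f[r][c] for f in frames) / n for c in range(cols)]
        for r in range(rows)
    ]


# Two 2x2 frames captured at the same time instance.
frames_at_t1 = [
    [[0.0, 2.0], [4.0, 6.0]],
    [[2.0, 4.0], [6.0, 8.0]],
]
frame_t1 = average_frames(frames_at_t1)
```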

Each image frame may comprise a plurality of pixels that are associated with a plurality of feature points. As shown in FIG. 2, the feature points may be associated with target objects (e.g., a group of people) and background objects (e.g., buildings, trees, golf course, gas station, etc.). In the example of FIG. 2, the target objects may be located at a first position at time T1 (see first image frame 212-1) and may have moved to a second position at time T2 (see second image frame 212-2).

The image analyzer may be configured to analyze the plurality of image frames to compute movement characteristics of the plurality of feature points. The movement characteristics of the plurality of feature points may comprise positional differences, and at least one of a velocity or an acceleration of each feature point. Comparing image frames 212-1 and 212-2, it may be observed that the feature points associated with the background objects may have “moved” substantially from right to left between the images at a velocity Vb′, whereas the feature points associated with the target objects may have “moved” substantially from left to right between the images at a velocity Vt′. The apparent translation of the background objects in the image frames may be attributed to the fact that the imaging device may be in motion when capturing the image frames.
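By way of illustration only, positional differences, velocities, and accelerations of a single feature point may be derived from its image positions in successive frames as sketched below. The sketch assumes that the position of the feature point in each frame is already known (in pixels) and that the capture times are given in seconds. Finite differences are used, and the function name movement_characteristics is hypothetical.

```python
def movement_characteristics(track, times):
    """Return (displacements, velocities, accelerations) for one feature point.

    track: list of (x, y) pixel positions, one per image frame.
    times: capture time of each frame in seconds, strictly increasing.
    Velocities are in pixels per second, accelerations in pixels per
    second squared; both come from simple finite differences.
    """
    displacements, velocities = [], []
    for i in range(1, len(track)):
        dt = times[i] - times[i - 1]
        dx = track[i][0] - track[i - 1][0]
        dy = track[i][1] - track[i - 1][1]
        displacements.append((dx, dy))
        velocities.append((dx / dt, dy / dt))
    accelerations = []
    for i in range(1, len(velocities)):
        dt = times[i + 1] - times[i]
        ax = (velocities[i][0] - velocities[i - 1][0]) / dt
        ay = (velocities[i][1] - velocities[i - 1][1]) / dt
        accelerations.append((ax, ay))
    return displacements, velocities, accelerations


# One feature point observed in three frames, 0.5 seconds apart.
disp, vel, acc = movement_characteristics(
    track=[(0.0, 0.0), (2.0, 0.0), (6.0, 0.0)],
    times=[0.0, 0.5, 1.0],
)
```

Two frames suffice for a positional difference and a velocity, whereas at least three frames are needed before this sketch can report an acceleration.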

The image analyzer may be further configured to identify at least one tracking feature relative to at least one background feature. This may comprise differentiating, based on the movement characteristics of the plurality of feature points, a first set of feature points and a second set of feature points from among the plurality of feature points. The first set of feature points may have substantially a first movement characteristic, and the second set of feature points may have substantially a second movement characteristic different from the first movement characteristic. For example, in FIG. 2, the feature points associated with the background objects may have substantially a first movement characteristic (e.g., right-to-left from image 212-1 to image 212-2 at velocity Vb′), whereas the feature points associated with the target objects may have substantially a second movement characteristic (e.g., left-to-right from image 212-1 to image 212-2 at velocity Vt′). Accordingly, the image analyzer may identify the feature points associated with the background objects as a first set of feature points, and the feature points associated with the target objects as a second set of feature points. The image analyzer may be further configured to identify background feature 214 as the first set of feature points and tracking feature 216 as the second set of feature points. By comparing the movement characteristics of the feature points, the tracking feature may be associated with the target objects, whereas the background feature may be associated with the background objects. The background feature may have substantially a same movement characteristic associated with the first movement characteristic of the first set of feature points. The tracking feature may have substantially a same movement characteristic associated with the second movement characteristic of the second set of feature points.
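By way of illustration only, the differentiation of a first set and a second set of feature points may be sketched as follows. The sketch assumes that an image-plane velocity is already available for each feature point. It further assumes, which the specification does not state, that the background occupies most of the frame, so that the per-axis median velocity approximates the first movement characteristic; feature points within a tolerance of that median form the first set, and the remainder form the second set. The function name split_by_motion and the tolerance parameter are hypothetical.

```python
from statistics import median


def split_by_motion(velocities, tol):
    """Split feature point indices into two sets by image-plane velocity.

    Assumption made for this sketch only: most feature points belong to
    the background, so the per-axis median velocity is taken as the first
    movement characteristic. Indices within `tol` (pixels per second) of
    it form the first set; all other indices form the second set.
    """
    ref_vx = median(v[0] for v in velocities)
    ref_vy = median(v[1] for v in velocities)
    first_set, second_set = [], []
    for idx, (vx, vy) in enumerate(velocities):
        distance = ((vx - ref_vx) ** 2 + (vy - ref_vy) ** 2) ** 0.5
        (first_set if distance <= tol else second_set).append(idx)
    return first_set, second_set


# Four points drifting right-to-left, two points moving left-to-right.
velocities = [
    (-3.0, 0.0), (-3.1, 0.1), (-2.9, 0.0), (-3.0, -0.1),
    (2.0, 0.0), (2.1, 0.1),
]
first_set, second_set = split_by_motion(velocities, tol=1.0)
```

In this sketch the four right-to-left points form the first set, corresponding to the background feature, and the two left-to-right points form the second set, corresponding to the tracking feature. No object class is consulted at any point. Where target objects fill most of the frame, the median assumption would not hold and a different grouping rule would be needed.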

In some embodiments, the image analyzer can identify the tracking feature and the background feature based only on the movement characteristics of the plurality of feature points. Accordingly, the image analyzer can identify the tracking feature and the background feature independent of an object recognition method. For example, the background feature and the tracking feature may be defined independent of any object class. This is in contrast to conventional vision-based tracking methods that typically identify features by classifying them into one or more object classes, or fitting them to one or more known models.

In some particular embodiments, after the image analyzer has identified the tracking feature and the background feature, the image analyzer may be further configured to categorize the tracking feature and the background feature into one or more object classes using an object recognition method. The object recognition method may comprise determining whether each of the tracking feature and the background feature belongs to one or more object classes. The object classes may comprise a building object class, a landscape object class, a people object class, an animal object class, and/or a vehicle object class. The object recognition method may be based on alignment models, invariant properties, and/or parts decomposition.

In some embodiments, the image analyzer may be configured to analyze the plurality of image frames using a pixel-based approach. For example, in those embodiments, the plurality of feature points may have a one-to-one correspondence to the plurality of pixels in the plurality of image frames. In other words, each feature point may correspond to a unique pixel. The image analyzer may be configured to analyze the plurality of image frames to compute movement characteristics of the plurality of pixels. The movement characteristics of the plurality of pixels may comprise positional differences, and at least one of a velocity or an acceleration of each pixel. Comparing image frames 212-1 and 212-2, it may be observed that the pixels associated with the background objects have “moved” substantially from right to left between the images at a velocity Vb′, whereas the pixels associated with the target objects have “moved” substantially from left to right between the images at a velocity Vt′. The apparent translation of the background objects in the image frames may be attributed to the fact that the imaging device may be in motion when capturing the image frames.

The image analyzer may be further configured to differentiate, based on the movement characteristics of the plurality of pixels, a first set of pixels and a second set of pixels from among the plurality of pixels. The first set of pixels may have substantially a first movement characteristic, and the second set of pixels may have substantially a second movement characteristic different from the first movement characteristic. For example, in FIG. 2, the pixels associated with the background objects may have substantially a first movement characteristic (e.g., right-to-left from image 212-1 to 212-2 at velocity Vb′), whereas the pixels associated with the target objects may have substantially a second movement characteristic (e.g., left-to-right from image 212-1 to 212-2 at velocity Vt′). Accordingly, the image analyzer may identify the pixels associated with the background objects as a first set of pixels, and the pixels associated with the target objects as a second set of pixels. The image analyzer may be further configured to identify the background feature 214 as the first set of pixels and the tracking feature 216 as the second set of pixels. By comparing the movement characteristics of the pixels, the tracking feature may be associated with the target objects, whereas the background feature may be associated with the background objects. The background feature may have substantially a same movement characteristic associated with the first movement characteristic of the first set of pixels. The tracking feature may have substantially a same movement characteristic associated with the second movement characteristic of the second set of pixels.
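The differentiation of pixels into two sets can be illustrated with a minimal sketch. It is not the claimed implementation: the function name `split_by_motion`, the `tolerance` parameter, and the use of a per-axis median to estimate the first movement characteristic are all assumptions made for illustration, and the sketch further assumes that the first (background) set contains the majority of the pixels.

```python
from statistics import median


def split_by_motion(displacements, tolerance=1.0):
    """Partition pixel indices into (first_set, second_set) by displacement.

    displacements: list of (dx, dy) per pixel between two image frames.
    Assumption (not from the text): the first set is the majority, so its
    movement characteristic is estimated by the per-axis median.
    """
    ref_dx = median(dx for dx, _ in displacements)
    ref_dy = median(dy for _, dy in displacements)
    first_set, second_set = [], []
    for index, (dx, dy) in enumerate(displacements):
        # Distance between this pixel's motion and the dominant motion.
        deviation = ((dx - ref_dx) ** 2 + (dy - ref_dy) ** 2) ** 0.5
        if deviation <= tolerance:
            first_set.append(index)
        else:
            second_set.append(index)
    return first_set, second_set


# Four pixels drift right-to-left (background); two move left-to-right (target).
displacements = [(-4.0, 0.0), (-4.1, 0.1), (-3.9, 0.0), (-4.0, -0.1),
                 (3.0, 0.2), (3.1, 0.0)]
background, tracking = split_by_motion(displacements)
print(background, tracking)
```

In this sketch no object class is consulted; the two sets are separated purely by how the pixels move, which mirrors the class-independent identification described above.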

As mentioned above, the image analyzer may be configured to analyze the plurality of image frames using the above-described pixel-based approach. The pixel-based approach can be used in illuminated environments, and also in low-light or dark environments. For example, the image analyzer can analyze thermal images (thermograms) obtained from a thermal imaging device using the pixel-based approach, and identify the background feature and the tracking feature based on movement characteristics of the pixels in the thermal images. Each pixel in the thermal images may be indicative of an amount of infrared energy emitted, transmitted, and/or reflected at a feature point in the target objects and the background objects. The pixel-based approach for analyzing thermal images may be well-suited for low-light or dark environments, since optical images captured in low-light or dark environments tend to have low brightness/contrast that makes it difficult to track the movement characteristics between different pixels.

In some embodiments, the image analyzer may be further configured to identify the background feature 214 by generating one or more contour(s) surrounding the first set of pixels, and to identify the tracking feature 216 by generating another contour surrounding the second set of pixels, as shown by the dotted circled regions in FIG. 2. The contours serve to distinguish the tracking feature 216 from the background feature 214. The contours may include different colors, patterns, or shading to differentiate the tracking feature from the background feature. The image analyzer may be further configured to generate a resulting image frame 213 depicting the identified tracking feature and background feature, as shown in FIG. 2. As previously mentioned, the resulting image frame may be provided (for example, in the form of analyzed signals 122) to an output device, such as a display device.

FIG. 3 illustrates different movement characteristics of a pixel in the image frames, in accordance with some embodiments. As previously described, the plurality of image frames may comprise at least a first image frame and a second image frame. The image analyzer may be configured to compute the movement characteristic of each pixel, for each pixel appearing in the first image frame and the second image frame. For example, the image analyzer may be configured to identify a position of each pixel in the first image frame and its corresponding position in the second image frame, and compute the movement characteristic of each pixel based on a difference between its positions in the first and second image frames. In some embodiments, the image analyzer may be configured to map the plurality of image frames, generate a transformation for each pixel based on the mapping, and compute the movement characteristic of each pixel using its transformation. The movement characteristic of a pixel appearing in the first and second frames may comprise a velocity of the pixel. The velocity of the pixel may be calculated using the following equation:

$V_p = C \cdot (T_{ref} - T_{current})$,

where V_(p) is the velocity of the pixel, C is a speed constant, T_(ref) is a reference transformation based on the position of the pixel in the first image frame, and T_(current) is a current transformation based on the position of the pixel in the second image frame. The velocity V_(p) may include both a vector component and a scalar component. An acceleration A_(p) of the pixel may be calculated by the change in velocity of the pixel over time:

$A_p = \Delta V_p / \Delta T$
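The two equations above can be exercised with a short worked example. This is a sketch under stated assumptions: the text does not specify the form of the transformations, so T_ref and T_current are assumed here to be 2-D translation vectors in pixel coordinates, and the function names and numeric values are hypothetical.

```python
def pixel_velocity(t_ref, t_current, speed_constant):
    """V_p = C * (T_ref - T_current), applied per axis.

    Assumption: each transformation is a 2-D translation (x, y).
    """
    return tuple(speed_constant * (r - c) for r, c in zip(t_ref, t_current))


def pixel_acceleration(v_prev, v_next, delta_t):
    """A_p = (change in V_p) / (elapsed time), applied per axis."""
    return tuple((n - p) / delta_t for p, n in zip(v_prev, v_next))


C = 2.0  # hypothetical speed constant
v1 = pixel_velocity((10.0, 5.0), (7.0, 5.0), C)  # between frames 1 and 2
v2 = pixel_velocity((7.0, 5.0), (3.0, 5.0), C)   # between frames 2 and 3
a = pixel_acceleration(v1, v2, delta_t=0.5)
print(v1, v2, a)  # (6.0, 0.0) (8.0, 0.0) (4.0, 0.0)
```

The sign of the result follows the equation as written (reference minus current); a different sign convention would simply negate the velocity.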

The velocity of a pixel may further comprise a linear velocity and/or an angular velocity of the pixel. The acceleration of a pixel may further comprise a linear acceleration and/or an angular acceleration of the pixel. For example, referring to FIG. 3 (Part A), the movement characteristic of a pixel may comprise a linear velocity and/or a linear acceleration when the pixel translates along a direction from its position in the first image frame to its position in the second image frame. As shown in FIG. 3 (Part B), the pixel may be at a first position in a first image frame 312-1 at time T1, and may have moved to a second position in a second image frame 312-2 at time T2. In the example of FIG. 3 (Part B), the movement of the pixel from the first position to the second position may be via translation (denoted by a straight arrow line), and may comprise a linear velocity V_(p_linear).

In some embodiments, for example as shown in FIG. 3 (Part C), the movement characteristic of a pixel may comprise an angular velocity ω and/or an angular acceleration A_(p_angular) when the pixel is rotating about a point O from its position in the first image frame to its position in the second image frame. A linear speed of the pixel may be given by V_(p_linear) = R·ω, where R is a distance from the pixel to the point O (or radius of a circle with center point O). As shown in FIG. 3 (Part D), the pixel may be at a first position in a first image frame 312-1 at time T1, and may have moved to a second position in image frame 312-2 at time T2. In the example of FIG. 3 (Part D), the pixel may move from the first position to the second position in a curvilinear direction (denoted by a curved arrow line) at an angular velocity ω.
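The relationship between angular velocity and linear speed can be illustrated as follows. This is a sketch only: the function name `angular_velocity`, the use of `atan2` to measure the swept angle, and the example coordinates are assumptions for illustration and are not taken from the figures.

```python
import math


def angular_velocity(p1, p2, center, dt):
    """Signed angular velocity (rad/s) of a pixel moving from p1 to p2 about center."""
    a1 = math.atan2(p1[1] - center[1], p1[0] - center[0])
    a2 = math.atan2(p2[1] - center[1], p2[0] - center[0])
    # Wrap the swept angle into [-pi, pi) so the shorter arc is used.
    dtheta = (a2 - a1 + math.pi) % (2 * math.pi) - math.pi
    return dtheta / dt


O = (0.0, 0.0)             # point O about which the pixel rotates
first_pos = (2.0, 0.0)     # position in the first image frame (time T1)
second_pos = (0.0, 2.0)    # position in the second image frame (time T2)

omega = angular_velocity(first_pos, second_pos, O, dt=1.0)
R = math.hypot(first_pos[0] - O[0], first_pos[1] - O[1])
v_linear = R * omega       # V_p_linear = R * omega
print(omega, R, v_linear)
```

With a quarter-turn in one second at radius 2, the sketch gives ω = π/2 rad/s and a linear speed of π pixels per second.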

As previously described, the tracking feature 216 may be identified by generating a contour surrounding the second set of pixels (or feature points associated with the target objects). In some embodiments, a size of the contour may change, for example, as shown in FIGS. 4, 5, 6, and 7.

FIG. 4 illustrates an exemplary sequence of image frames whereby the size of the contour surrounding a tracking feature may increase, in accordance with some embodiments. Specifically, FIG. 4 illustrates that the size of the contour surrounding the tracking feature may increase when more target objects (e.g., people, vehicles, animals, etc.) join a pre-existing group of target objects. As shown in FIG. 4, a first image frame 412-1, a second image frame 412-2, and a third image frame 412-3 may be captured by an imaging device at times T1, T2, and T3, respectively. The first image frame may correspond, for example, to the resulting image frame 213 shown in FIG. 2. The first image frame may comprise a first tracking feature 416 comprising a group of target objects that have been previously identified by the image analyzer. At time T2, additional tracking features 416-1 and 416-2 may be identified by the image analyzer at the left portion and bottom right portion of the second image frame. The additional tracking features may move towards the first tracking feature and converge with the first tracking feature at time T3, as illustrated by the third image frame 412-3. The size of the pixels (or feature points) associated with the target objects may increase from T1 to T3 due to the convergence of the tracking features. Accordingly, the size of the contour surrounding those pixels (or tracking features) may increase as the number of target objects increases in the image frame. In some embodiments, the converged tracking features may be collectively treated as a common group of tracking features. In some alternative embodiments, the image analyzer may continue to track each individual tracking feature 416-1, 416-2, and 416-3 even after the tracking features have apparently merged into a single group. In some embodiments, whether the tracking features are tracked individually or collectively as a group may depend on a distance between adjacent tracking features. For example, if the distance between adjacent features is greater than a predetermined distance, the tracking features may be tracked individually since the tracking features may have a low spatial density. Conversely, if the distance between adjacent features is less than a predetermined distance, the tracking features may be tracked collectively as a single group since the tracking features may have a high spatial density. The predetermined distance may be determined based on a size, shape, or areal density of the target objects. In some embodiments, when the size of the contour surrounding the tracking feature starts to increase, the imaging device may move to a higher vertical location relative to the target objects, or a further lateral distance away from the target objects, so that the tracking feature can be substantially positioned in the field-of-view of the imaging device or in a target region of the image frames captured by the imaging device.
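The distance-based choice between individual and collective tracking can be sketched as below. The grouping rule used here (chaining together any features closer than the predetermined distance) is one possible interpretation and is an assumption, as are the function name `group_tracking_features` and the example coordinates.

```python
import math


def group_tracking_features(centers, predetermined_distance):
    """Group feature centers whose neighbor distance is below the threshold.

    Returns a list of groups (sorted index lists). A group of size 1 would be
    tracked individually; a larger group would be tracked collectively.
    """
    unassigned = set(range(len(centers)))
    groups = []
    while unassigned:
        seed = min(unassigned)
        unassigned.remove(seed)
        group, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            near = [i for i in unassigned
                    if math.dist(centers[current], centers[i]) < predetermined_distance]
            for i in near:
                unassigned.remove(i)
            group.extend(near)
            frontier.extend(near)
        groups.append(sorted(group))
    return groups


# Three features close together and one far away (hypothetical pixel coordinates).
centers = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 10.0)]
groups = group_tracking_features(centers, predetermined_distance=1.5)
print(groups)  # [[0, 1, 2], [3]]
```

Raising the predetermined distance merges more features into a common group, while lowering it causes each feature to be tracked on its own.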

FIG. 5 illustrates an exemplary sequence of image frames whereby the size of the contour surrounding a tracking feature may decrease, in accordance with some embodiments. Specifically, FIG. 5 illustrates that the size of the contour surrounding the tracking feature may decrease when one or more target objects leave a pre-existing group of target objects. As shown in FIG. 5, a first image frame 512-1, a second image frame 512-2, and a third image frame 512-3 may be captured at times T1, T2, and T3, respectively. The first image frame may correspond, for example, to the resulting image frame 213 shown in FIG. 2. The first image frame may comprise a first tracking feature 516 comprising a group of target objects that have been previously identified by the image analyzer. Some of the target objects may begin to diverge (disperse) from the group at time T2 and may have moved outside of the field-of-view of the imaging device at time T3, as illustrated by the third image frame 512-3. The size of the pixels (or feature points) associated with the target objects may decrease from T1 to T3 due to the reduction in size of the group of target objects. Accordingly, the size of the contour surrounding those pixels (or tracking feature) may decrease as the number of target objects decreases. In some embodiments, when the size of the contour surrounding the tracking feature starts to decrease, the imaging device may move to a lower vertical location relative to the target objects, or a shorter lateral distance away from the target objects, so that the tracking feature can be substantially positioned in the field-of-view of the imaging device or in a target region of the image frames captured by the imaging device.

In some embodiments, the size of the contour surrounding a tracking feature may be defined by the positions of the outermost target objects within the group. FIG. 6 illustrates an exemplary sequence of image frames whereby the size of the contour surrounding a tracking feature may increase, in accordance with some other embodiments. For example, as shown in FIG. 6, a first image frame 612-1, a second image frame 612-2, and a third image frame 612-3 may be captured at times T1, T2, and T3, respectively. The first image frame may correspond, for example, to the resulting image frame 213 shown in FIG. 2. The first image frame may comprise a first tracking feature 616 comprising a group of target objects that have been previously identified by the image analyzer. The target objects may begin to diverge from the group at time T2. However, those target objects still remain in the field-of-view of the imaging device at time T3, as illustrated by the third image frame 612-3. The size of the pixels (or feature points) associated with the target objects may increase from T1 to T3 due to the divergence of the group of target objects. Accordingly, the size of the contour surrounding those pixels (or tracking feature) may increase as the target objects become more spaced apart to occupy a larger area. In some embodiments, when the size of the contour surrounding the tracking feature starts to increase, the imaging device may move to a higher vertical location relative to the target objects, or a further lateral distance away from the target objects, so that the tracking feature can be substantially positioned in the field-of-view of the imaging device or in a target region of the image frames captured by the imaging device.

Similarly, FIG. 7 illustrates an exemplary sequence of image frames whereby the size of the contour surrounding a tracking feature may decrease, in accordance with some other embodiments. For example, as shown in FIG. 7, a first image frame 712-1, a second image frame 712-2, and a third image frame 712-3 may be captured at times T1, T2, and T3, respectively. The first image frame may correspond, for example, to image frame 612-3 shown in FIG. 6. The first image frame may comprise a first tracking feature 716 comprising a group of target objects that have been previously identified by the image analyzer. The target objects may begin to converge at time T2. At time T3, the target objects may have converged to a smaller area, as illustrated by the third image frame 712-3. The size of the pixels (or feature points) associated with the target objects may decrease from T1 to T3 due to the convergence of the group of target objects. Accordingly, the size of the contour surrounding those pixels (or tracking feature) may decrease as the target objects converge onto a smaller area. In some embodiments, when the size of the contour surrounding the tracking feature starts to decrease, the imaging device may move to a lower vertical location relative to the target objects, or a shorter lateral distance away from the target objects, so that the tracking feature can be substantially positioned in the field-of-view of the imaging device or in a target region of the image frames captured by the imaging device.
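The response of the imaging device to a growing or shrinking contour, as described for FIGS. 4 through 7, can be sketched as follows. An axis-aligned bounding box over the outermost target positions is used as a stand-in for the contour, which is a simplifying assumption; the function names, the `tolerance` parameter, and the returned labels are likewise hypothetical.

```python
def contour_area(points):
    """Area of the axis-aligned box spanned by the outermost target positions."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def suggest_adjustment(prev_points, curr_points, tolerance=0.05):
    """Suggest how the imaging device might reposition between two frames."""
    prev_area = contour_area(prev_points)
    curr_area = contour_area(curr_points)
    if curr_area > prev_area * (1 + tolerance):
        return "move_higher_or_farther"   # contour growing (FIGS. 4 and 6)
    if curr_area < prev_area * (1 - tolerance):
        return "move_lower_or_closer"     # contour shrinking (FIGS. 5 and 7)
    return "hold"


compact = [(4.0, 4.0), (6.0, 4.0), (4.0, 6.0), (6.0, 6.0)]   # hypothetical T1
spread = [(2.0, 2.0), (8.0, 2.0), (2.0, 8.0), (8.0, 8.0)]    # hypothetical T3

print(suggest_adjustment(compact, spread))   # move_higher_or_farther
print(suggest_adjustment(spread, compact))   # move_lower_or_closer
```

The tolerance avoids reacting to small frame-to-frame fluctuations in contour size; its value would need tuning for a given application.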

In the examples of FIGS. 4, 5, 6, and 7, the target objects may correspond to group(s) of people. However, it is noted that the target objects are not limited thereto. The size and/or shape of the contour surrounding the tracking feature may change with the movement, convergence, divergence, addition, and/or subtraction of one or more target objects of different object classes (e.g., a combination of people, vehicles, animals, etc.), for example as shown in FIG. 8. Any change in the size and/or shape of the contour surrounding the tracking feature may be contemplated. The size and/or shape of the contour may be amorphous and may change as the number of target objects changes (i.e., change in areal density of the target objects), or when the target objects move collectively in a random manner, for example as shown in FIG. 9. In the example of FIG. 9, each contour may contain a plurality of feature points (or pixels) associated with the target objects. Individual target objects within a contour may or may not be identified or tracked, as long as the target objects have substantially a same movement characteristic. In some embodiments, the size of the contour may increase when the imaging device is located closer to the target objects (due to magnification). Conversely, the size of the contour may decrease when the imaging device is located further away from the target objects (due to de-magnification).

In some embodiments, the image analyzer may be configured to determine that the tracking feature is moving relative to the background feature, based on the movement characteristics of the feature points. For example, referring back to FIG. 2, the image analyzer can determine that the tracking feature 216 is moving relative to the background feature 214 based on the movement characteristics of the feature points determined from image frames 212-1 and 212-2 at times T1 and T2.

FIGS. 10, 11, 12, 13, and 14 illustrate the tracking/following of target objects by an imaging device, in accordance with different embodiments. In the examples of FIGS. 10, 11, 12, 13, and 14, the imaging device may be stationary, with the target objects and the background objects located in the field-of-view of the imaging device. If the imaging device is stationary, the background feature may not move at all. As shown in FIG. 10, the imaging device may be located directly above a central region comprising the target objects and the background objects. For example, the imaging device of FIG. 10 may be mounted on a UAV that is hovering at a fixed location directly above the target objects and the background objects. As shown in FIG. 11, the imaging device may be located above and at an angle relative to the target objects and the background objects. For example, the imaging device of FIG. 11 may be mounted on a UAV that is hovering at a fixed location above and at an angle relative to the target objects and the background objects. As shown in FIG. 12, the imaging device may be located on the ground at a distance from the target objects and the background objects. The imaging device of FIG. 12 may be mounted on a stationary structure 1204 such as a tower, a pole, a building, etc. In some embodiments, the imaging device of FIG. 12 may be mounted on an extension pole to which the imaging device is affixed. The extension pole may be held by a user or planted at a fixed location. In some embodiments, the imaging device may be capable of rotating about a fixed point (e.g., a security camera).

In the examples of FIGS. 10, 11, and 12, a contour surrounding the tracking feature in an image frame may remain relatively constant as the target objects move from one location to another. In contrast, in the examples of FIGS. 13 and 14, a contour surrounding the tracking feature may change as the target objects move from one location to another. For example, as shown in FIGS. 13 and 14, the size and shape of a contour surrounding the tracking feature may change as the target objects move from a first location at time T1 to a second location at time T2 and to a third location at time T3. In the example of FIG. 13, the imaging device may be mounted on a UAV that is hovering at a fixed location above and at an angle relative to the target objects and the background objects. In contrast, in the example of FIG. 14, the imaging device may be mounted on a stationary structure 1404 such as a tower, a pole, a building, etc. In some embodiments, the imaging device of FIG. 14 may be mounted on an extension pole to which the imaging device is affixed. The extension pole may be held by a user or planted at a fixed location.

In the examples of FIGS. 10, 11, 12, 13, and 14, the imaging device can be used to track the target objects. The image analyzer may be configured to identify the tracking feature (target objects) relative to the background feature (background objects) in the image frames, as previously described. After the tracking feature and the background feature have been identified, the target objects can be tracked as they move from one location to another location, based on the real-time movement characteristics of the pixels (or feature points) between image frames. In some embodiments, the image analyzer may be configured to track the target objects as they move from one location to another location. In other embodiments, a tracking device may be configured to track the target objects, based on the tracking feature and background feature that have been identified in the image frames by the image analyzer.

In some embodiments, the optical flow algorithm described in FIGS. 1-14 may be implemented on a mobile platform. FIG. 15 illustrates an example of a mobile platform that may also serve as a visual tracking system. Specifically, FIG. 15 illustrates a visual tracking system 1500 comprising an image analyzer for computing movement characteristics of a plurality of pixels based on motion characteristics of an imaging device, in accordance with some embodiments. In the embodiment of FIG. 15, an imaging device 1510 may be capable of motion. For example, the imaging device may be mounted or supported on a UAV. The visual tracking system may further comprise a motion sensing module 1530 configured to sense motion of the imaging device, and to provide motion signals 1532 to image analyzer 1520. The motion signals may include motion characteristics of the imaging device.

In the example of FIG. 15, the image analyzer may be configured to support visual tracking of one or more target objects. The imaging device may be configured to capture image frames of objects 1502. The image analyzer may be configured to receive a plurality of image signals 1512 from the imaging device. The image signals may be indicative of a plurality of image frames (e.g., a first image frame 1512-1 and a second image frame 1512-2) captured by the imaging device over a period of time (e.g., at times T1 and T2, respectively) while the imaging device is in motion. Each image frame may comprise a plurality of pixels. The image analyzer may be further configured to obtain the motion characteristics of the imaging device based on the plurality of motion signals, and to analyze the plurality of image signals based on the motion characteristics of the imaging device, so as to compute movement characteristics associated with the plurality of pixels. The computed movement characteristics may be encoded in analyzed signals 1522 that are output from the image analyzer. The aforementioned steps can be implemented using an optical flow algorithm, and will be described in further detail with reference to FIG. 16. Specifically, FIG. 16 illustrates the computation of movement characteristics of a plurality of pixels in exemplary images using the image analyzer of FIG. 15, in accordance with some embodiments.

Referring to FIG. 16, an image analyzer (e.g., image analyzer 1520 of FIG. 15) may receive a plurality of image signals from an imaging device 1610. The image signals may comprise a first image frame 1612-1 captured at time T1 at location 1 and a second image frame 1612-2 captured at time T2 at location 2, whereby time T2 may be a point in time occurring after time T1, and locations 1 and 2 are different locations each having a unique set of spatial coordinates. Although FIG. 16 depicts two image frames, any number of image frames may be contemplated. For example, in some embodiments, the image signals may comprise a plurality of image frames 1612-1 to 1612-n captured over a period of time starting from T1 to Tn at respective locations 1 to m, where m and n may be any integer greater than 1.

In some embodiments, more than one image frame may be captured at a particular time instance. For example, the image signals may comprise a plurality of image frames 1612-1 captured at time T1, a plurality of image frames 1612-2 captured at time T2, and so forth. The plurality of image frames at each time instance may be averaged and transformed into a single image frame associated with that particular time instance. In some embodiments, a greater number of image frames may be captured when the target object and imaging device are moving quickly, and a smaller number of image frames may be captured while the target object and/or the imaging device are moving slowly.
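The averaging of several image frames captured at one time instance can be sketched as a per-pixel mean. The representation of a frame as a nested list of intensities, the function name `average_frames`, and the sample values are assumptions for illustration; any further transformation applied after averaging is not specified in the text and is not shown.

```python
def average_frames(frames):
    """Per-pixel mean of equally sized frames (each a list of rows of intensities)."""
    count = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(frame[r][c] for frame in frames) / count for c in range(cols)]
            for r in range(rows)]


# Two hypothetical 2x2 frames captured at the same time instance T1.
frames_at_t1 = [
    [[0, 10], [20, 30]],
    [[10, 20], [30, 40]],
]
single_frame = average_frames(frames_at_t1)
print(single_frame)  # [[5.0, 15.0], [25.0, 35.0]]
```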

Each image frame may comprise a plurality of pixels that are associated with a plurality of feature points. As shown in FIG. 16, the feature points may be associated with target objects (e.g., a group of people) and background objects (e.g., buildings, trees, golf course, gas station, etc.). In the example of FIG. 16, the target objects may be located at a first position at time T1 (see first image frame 1612-1) and may have moved to a second position at time T2 (see second image frame 1612-2).

The plurality of pixels may be associated with a plurality of feature points. The image analyzer may be configured to analyze the plurality of image signals based on the motion characteristics of the imaging device. For example, the image analyzer may be configured to correlate the plurality of image frames to one another based on the motion characteristics of the imaging device. The image analyzer may be further configured to identify at least one tracking feature relative to at least one background feature based on the movement characteristics associated with the plurality of pixels.

For example, referring to FIG. 16, the imaging device may move along the positive (+) x-axis direction with speed Vi from location 1 to location 2. Accordingly, the background feature in the image frames will translate along the negative (−) x-axis direction with speed Vb′, since the imaging device is moving relative to the stationary background objects. Speed Vb′ may be proportional to speed Vi by a scaling constant, depending on a distance of the imaging device to each background object, the amount of distance traveled by the imaging device, and the field-of-view of the imaging device. Accordingly, the speed Vb′ at which the background feature translates across the image frames may be a function of the speed Vi at which the imaging device moves in 3-dimensional space. Subsequently, the image analyzer can identify the background features, by identifying feature points that move across the image frames at a speed Vb′ that is scaled in proportion to the speed Vi and that is opposite to the direction in which the imaging device travels.

Since the target objects are moving relative to the stationary background objects, the tracking feature associated with the target objects will move at a velocity different from that of the background feature. This difference in movement between the target feature and the background feature is depicted in the image frames. In the example of FIG. 16, the target objects may move at a speed Vt in a direction different from that of the imaging device. When the motion of the target objects is captured in the image frames, the tracking feature may be observed to move at a speed Vt′ at an angle θ relative to the positive x-axis direction. Accordingly, the image analyzer can identify the target feature, by identifying feature points that move across image frames with a speed/direction that is different from those feature points associated with the background feature.
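The use of the imaging device's own motion to separate background feature points from tracking feature points can be sketched as follows. The sketch assumes a single known scaling constant (here `scale`, in pixels per meter) that applies to all background objects, which simplifies the depth dependence noted above; the function name, tolerance, and numeric values are hypothetical.

```python
import math


def classify_feature_points(pixel_velocities, camera_velocity, scale, tolerance=0.5):
    """Split feature points into (background, tracking) index lists.

    Background points are expected to move at Vb' = -scale * Vi, i.e. opposite
    to the imaging device and proportional to its speed.
    """
    expected = (-scale * camera_velocity[0], -scale * camera_velocity[1])
    background, tracking = [], []
    for index, velocity in enumerate(pixel_velocities):
        if math.dist(velocity, expected) <= tolerance:
            background.append(index)
        else:
            tracking.append(index)
    return background, tracking


camera_velocity = (2.0, 0.0)   # Vi along the positive x-axis (assumed m/s)
pixel_velocities = [(-6.0, 0.0), (-5.8, 0.1), (4.0, 3.0), (-6.1, -0.2)]
background, tracking = classify_feature_points(pixel_velocities, camera_velocity, scale=3.0)

# Speed Vt' and angle theta (relative to the positive x-axis) of the tracking feature.
vx, vy = pixel_velocities[tracking[0]]
vt_speed = math.hypot(vx, vy)
theta_deg = math.degrees(math.atan2(vy, vx))
print(background, tracking, vt_speed, theta_deg)
```

In this example the three points moving at roughly −6 pixels per second along x are attributed to the background, and the remaining point, moving at 5 pixels per second at about 36.9 degrees, is attributed to the tracking feature.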

The background feature may be associated with a first set of pixels having substantially a first movement characteristic, and the tracking feature may be associated with a second set of pixels having substantially a second movement characteristic. The movement characteristics associated with the plurality of pixels may comprise at least one of a velocity and an acceleration of each pixel as measured across the plurality of image frames. The velocity of each pixel may further comprise a linear (translational) velocity and/or an angular velocity of each pixel. The linear velocity of each pixel may comprise a linear direction and a linear speed of each pixel.

The motion characteristics of the imaging device may comprise at least one of an attitude, an instantaneous position, a velocity, and an acceleration of the imaging device. The velocity of the imaging device may further comprise a linear velocity and/or an angular velocity of the imaging device. The linear velocity of the imaging device may comprise a linear direction and a linear speed of the imaging device. The first linear direction of the first set of pixels may be associated with the linear direction of the imaging device. The first linear speed of the first set of pixels (associated with the background feature) may be proportional to the linear speed of the imaging device by a speed constant. The angular velocity of the imaging device may comprise a rotational direction and a rotational speed of the imaging device. A curvilinear direction of the first set of pixels may be associated with the rotational direction of the imaging device. A curvilinear speed of the first set of pixels may be proportional to the rotational speed of the imaging device by a speed constant. The acceleration of the imaging device may further comprise a linear acceleration and/or an angular acceleration of the imaging device. The linear acceleration of the first set of pixels may be associated with the linear acceleration of the imaging device. The angular acceleration of the first set of pixels may be associated with the angular acceleration of the imaging device.

In some embodiments, the instantaneous position of the imaging device may be determined using a range-finding and/or locating device. The range-finding and/or locating device may be a Global Positioning System (GPS) device. In some embodiments, the range-finding and/or locating device may be a time-of-flight camera that is capable of measuring distances between the imaging device and the target objects/background objects. The instantaneous position of the imaging device may be determined relative to physical locations of the background objects. In some embodiments, the image analyzer may be configured to calculate a scaling factor based on the instantaneous position of the imaging device and the physical locations of the background objects. In some embodiments, the image analyzer may be further configured to compute the movement characteristic of each pixel using the motion characteristics of the imaging device and the scaling factor. In some embodiments, the motion characteristics of the imaging device may be determined using sensors such as location sensors (e.g., global positioning system (GPS) sensors, mobile device transmitters enabling location triangulation), vision sensors (e.g., imaging devices capable of detecting visible, infrared, or ultraviolet light, such as cameras), proximity or range sensors (e.g., ultrasonic sensors, lidar, time-of-flight or depth cameras), inertial sensors (e.g., accelerometers, gyroscopes, inertial measurement units (IMUs)), altitude sensors, attitude sensors (e.g., compasses), pressure sensors (e.g., barometers), audio sensors (e.g., microphones), and/or field sensors (e.g., magnetometers, electromagnetic sensors).
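One way a scaling factor might be derived from the position of the imaging device and the location of a background object is sketched below. The text does not state how the scaling factor is computed; this sketch assumes a simple pinhole camera model in which the factor equals the focal length (in pixels) divided by the distance to the object, with the object near the optical axis. The function names, focal length, and coordinates are all hypothetical.

```python
import math


def scaling_factor(camera_position, object_position, focal_length_px):
    """Pixels of image motion per meter of lateral camera motion.

    Assumption: pinhole model, object near the optical axis, so the factor
    is focal_length_px / distance.
    """
    distance = math.dist(camera_position, object_position)
    return focal_length_px / distance


def expected_pixel_velocity(camera_velocity, scale):
    """Expected image-plane velocity of a stationary background point."""
    return tuple(-scale * component for component in camera_velocity)


camera_position = (0.0, 0.0, 40.0)     # imaging device 40 m above the ground
background_object = (0.0, 0.0, 0.0)    # background object directly below
scale = scaling_factor(camera_position, background_object, focal_length_px=800.0)
vb = expected_pixel_velocity((1.5, 0.0), scale)   # camera moving at 1.5 m/s along +x
print(scale, vb)
```

With these assumed values the scaling factor is 20 pixels per meter, so a stationary background point would be expected to move at about 30 pixels per second in the negative x direction.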

Based on the motion characteristics of the imaging device, and themovement characteristics of the background feature and target feature,the image analyzer can determine the movement of the target objectsrelative to the background objects and the imaging device. For example,the image analyzer can detect the directions and speeds at which thetarget objects are moving relative to the background objects and theimaging device. FIGS. 17, 18, and 19 illustrate different embodiments inwhich an imaging device is tracking a group of target objects.Specifically, FIG. 17 illustrates a visual tracking system 1700 in whichan imaging device is moving at speed Vi and the target objects aremoving at speed Vt in substantially a same direction, where Vi issubstantially the same as Vt (Vi≅Vt).

FIGS. 18 and 19 illustrate embodiments in which the target objects and the imaging device may be moving in a substantially same direction but at different speeds. In visual tracking system 1800 of FIG. 18, the imaging device may be moving slower than the target objects. For example, the imaging device may be moving at speed Vi and the target objects may be moving at speed Vt, where Vi may be less than Vt (Vi<Vt). Conversely, in visual tracking system 1900 of FIG. 19, the imaging device may be moving faster than the target objects. For example, the imaging device may be moving at speed Vi and the target objects may be moving at speed Vt, where Vi may be greater than Vt (Vi>Vt). The different embodiments depicted in FIGS. 17, 18, and 19 may be depicted schematically in FIG. 20. Each part in FIG. 20 may correspond to a different relative movement between the imaging device and the target objects. For example, FIG. 20 (Part A) may correspond to the embodiment in FIG. 17 (Vi≅Vt); FIG. 20 (Part B) may correspond to the embodiment in FIG. 19 (Vi>Vt); and FIG. 20 (Part C) may correspond to the embodiment in FIG. 18 (Vi<Vt).

Referring to FIG. 20 (Part A), an imaging device may capture a firstimage frame 2012-1 at time T1 and a second image frame 2012-2 at timeT2. The imaging device and the target objects may move at substantiallya same speed in a substantially same direction. For example, the imagingdevice may move at a speed Vi and the target objects may move at a speedVt along the positive x-axis direction, whereby Vi and Vt may besubstantially the same (Vi≅Vt). As previously described, the speed Vb′at which the background feature translates across the image frames maybe a function of the speed Vi at which the imaging device moves in a3-dimensional space (in this case, along the positive x-axis direction).The speed Vt′ at which the target feature translates across the imageframes may be a function of the speed Vt at which the target objectsmove in a 3-dimensional space (in this case, also along the positivex-axis direction). Since the imaging device is moving relative to thebackground objects, the background feature in the image frames maytranslate at speed Vb′ in the opposite direction in which the imagingdevice is moving, as shown in FIG. 20 (Part A). The background featureand target feature may translate at substantially a same speed (Vb′≅Vt′)and by a same distance between the first and second image frames, but inopposite directions to each other. Based on the movement characteristicsof the feature points in FIG. 20 (Part A), the image analyzer candetermine that the imaging device and the target objects are moving atsubstantially a same speed in a substantially same direction.

In some embodiments, the imaging device and the target objects may movein substantially a same direction but at different speeds. For example,referring to FIG. 20 (Part B), the imaging device may move faster thanthe target objects. Specifically, the imaging device may move at a speedVi and the target objects may move at a speed Vt along the positivex-axis direction, whereby Vi is greater than Vt (Vi>Vt). Accordingly,the background feature may translate at speed Vb′ in the negative x-axisdirection between the first and second image frames, and the targetfeature may translate at speed Vt′ in the positive x-axis directionbetween the first and second image frames, where Vt′<Vb′. Based on themovement characteristics of the feature points in FIG. 20 (Part B), theimage analyzer can determine that the imaging device and the targetobjects are moving in substantially a same direction, and that thetarget objects are moving slower than the imaging device.

In some cases, for example referring to FIG. 20 (Part C), the imagingdevice may be moving slower than the target objects. Specifically, theimaging device may move at a speed Vi and the target objects may move ata speed Vt along the positive x-axis direction, whereby Vi is less thanVt (Vi<Vt). Accordingly, the background feature may translate at speedVb′ in the negative x-axis direction between the first and second imageframes, and the target feature may translate at speed Vt′ in thepositive x-axis direction between the first and second image frames,where Vt′>Vb′. Based on the movement characteristics of the featurepoints in FIG. 20 (Part C), the image analyzer can determine that theimaging device and the target objects are moving in substantially a samedirection, and that the target objects are moving faster than theimaging device.

In some embodiments, the image analyzer can detect that the target objects may be stationary or at rest. For example, referring to FIG. 21 (Part A), the imaging device may capture a first image frame 2112-1 at time T1 and a second image frame 2112-2 at time T2. The imaging device may move at a speed Vi along the positive x-axis direction. However, the target objects may be stationary or at rest. Accordingly, the background feature may translate at speed Vb′ in the negative x-axis direction, and the target feature may translate at speed Vt′ in the negative x-axis direction, whereby Vb′ is substantially equal to Vt′ (Vb′≅Vt′). Since the target feature and the background feature are moving in substantially a same direction at substantially a same speed, there is no relative motion between the target objects and the background objects. Accordingly, based on the movement characteristics of the feature points in FIG. 21 (Part A), the image analyzer can determine that the target objects are stationary or at rest. The embodiment of FIG. 21 (Part A) may be based on an assumption that the target objects have been previously identified at some other time instance based on their movement relative to the background objects.

In the embodiment of FIG. 20, the imaging device and the target objects may be moving in substantially a same direction. In some instances, the imaging device and the target objects can also move in opposite directions, for example as illustrated in FIG. 21 (Parts B and C).

Referring to FIG. 21 (Part B), the imaging device may be moving fasterthan the target objects but in opposite directions. Specifically, theimaging device may move at a speed Vi along the positive x-axisdirection and the target objects may move at a speed Vt along thenegative x-axis direction, whereby Vi is greater than Vt (Vi>Vt).Accordingly, the background feature may translate at speed Vb′ in thenegative x-axis direction between the first and second image frames, andthe target feature may translate at speed Vt′ in the negative x-axisdirection between the first and second image frames, where Vt′<Vb′.Based on the movement characteristics of the feature points in FIG. 21(Part B), the image analyzer can determine that the imaging device andthe target objects are moving in substantially opposite directions, andthat the target objects are moving slower than the imaging device.

Likewise, referring to FIG. 21 (Part C), the imaging device may bemoving slower than the target objects but in opposite directions.Specifically, the imaging device may move at a speed Vi along thepositive x-axis direction and the target objects may move at a speed Vtalong the negative x-axis direction, whereby Vi is less than Vt (Vi<Vt).Accordingly, the background feature may translate at speed Vb′ in thenegative x-axis direction between the first and second image frames, andthe target feature may translate at speed Vt′ in the negative x-axisdirection between the first and second image frames, where Vt′>Vb′.Based on the movement characteristics of the feature points in FIG. 21(Part C), the image analyzer can determine that the imaging device andthe target objects are moving in substantially opposite directions, andthat the target objects are moving faster than the imaging device.

In some embodiments, the imaging device and the target objects may be moving in directions that are oblique to one another, as illustrated in FIG. 22.

For example, referring to FIG. 22 (Part A), the imaging device and the target objects may move at substantially a same speed in directions that are oblique to one another. For example, the imaging device may move at a speed Vi along the positive x-axis direction and the target objects may move at a speed Vt in a direction that is oblique to the positive x-axis direction. Vi and Vt may be substantially the same (Vi≅Vt). Accordingly, the background feature may translate at speed Vb′ in the negative x-axis direction between the first and second image frames, and the target feature may translate at speed Vt′ in an oblique direction between the first and second image frames, where Vt′≅Vb′. Based on the movement characteristics of the feature points in FIG. 22 (Part A), the image analyzer can determine that the imaging device and the target objects are moving in directions that are oblique to one another, and that the target objects and the imaging device are moving at substantially the same speed.

In some embodiments, the imaging device and the target objects may movein different directions and at different speeds. For example, in someinstances, the imaging device and the target objects may move indirections that are oblique to one another, and the imaging device maymove faster than the target objects. As shown in FIG. 22 (Part B), theimaging device may move at a speed Vi along the positive x-axisdirection and the target objects may move at a speed Vt in a directionthat is oblique to the positive x-axis direction. Vi may be greater thanVt (Vi>Vt). Accordingly, the background feature may translate at speedVb′ in the negative x-axis direction between the first and second imageframes, and the target feature may translate at speed Vt′ in an obliquedirection between the first and second image frames, where Vt′<Vb′.Based on the movement characteristics of the feature points in FIG. 22(Part B), the image analyzer can determine that the imaging device andthe target objects are moving in directions that are oblique to oneanother, and that the target objects are moving slower than the imagingdevice.

In some other instances, the imaging device and the target objects maymove in directions that are oblique to one another, and the imagingdevice may be moving slower than the target objects. Referring to FIG.22 (Part C), the imaging device may move at a speed Vi along thepositive x-axis direction and the target objects may move at a speed Vtin a direction that is oblique to the positive x-axis direction. Vi maybe less than Vt (Vi<Vt). Accordingly, the background feature maytranslate at speed Vb′ in the negative x-axis direction between thefirst and second image frames, and the target feature may translate atspeed Vt′ in an oblique direction between the first and second imageframes, where Vt′>Vb′. Based on the movement characteristics of thefeature points in FIG. 22 (Part C), the image analyzer can determinethat the imaging device and the target objects are moving in directionsthat are oblique to one another, and that the target objects are movingfaster than the imaging device.

As previously described, the imaging device and the target objects maymove in different directions. The different directions may includedirections that are parallel to one another, oblique to one another,that form an acute angle with one another, or that form an obtuse anglewith one another. In some instances, the different directions mayinclude directions that are perpendicular to one another. Anyorientation of the moving directions of the imaging device and thetarget objects may be contemplated.

In the embodiments of FIGS. 20, 21, and 22, the imaging devices and the target objects move linearly, which results in a translation of the background feature and the target feature between image frames. In some embodiments, the imaging devices and/or the target objects may have non-linear motion characteristics. For example, the imaging devices and/or the target objects may move in a curvilinear manner along an arc, which may result in a rotation of the background feature and/or the target feature between image frames.

FIG. 23 illustrates an embodiment in which an imaging device 2310 is tracking a target object 2316 in a curvilinear manner along an arc. The imaging device and the target object may move at different speeds along the arc. For example, at time T1, the imaging device and the target object may be in a first location and separated by a distance D1. At time T2, the imaging device and the target object may be in a second location and separated by a distance D2, where D2 is greater than D1. In other words, an angular speed of the target object may be greater than an angular speed of the imaging device between times T1 and T2. The image analyzer may be configured to analyze the non-linear motion characteristics of features in the image frames, as described with reference to FIGS. 24 and 25.

In the embodiment of FIG. 24, the imaging device may be moving in a linear direction and the target objects may be moving in a curvilinear direction.

For example, referring to FIG. 24 (Part A), the imaging device may move at a speed Vi along the positive x-axis direction and the target objects may move at a speed Vt in a curvilinear direction. The speed Vt may correspond to a linear speed, and may be calculated using Vt = R·ω, where R is the radius of an arc (circle) in the curvilinear direction and ω is the angular speed of the target objects. In the embodiment of FIG. 24 (Part A), Vi and Vt may be substantially the same (Vi≅Vt). Accordingly, the background feature may translate at speed Vb′ in the negative x-axis direction between the first and second image frames, and the target feature may translate at speed Vt′ in a curvilinear direction between the first and second image frames, where Vt′≅Vb′. Based on the movement characteristics of the feature points in FIG. 24 (Part A), the image analyzer can determine that the imaging device is moving in a linear direction, that the target objects are moving in a curvilinear direction, and that the target objects and the imaging device are moving at substantially the same speed.
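The relationship between the linear speed, the arc radius R, and the angular speed ω can be checked numerically. The following minimal sketch uses hypothetical values and a hypothetical 5% tolerance for "substantially the same"; the description does not define a tolerance.

```python
import math


def linear_speed_on_arc(radius: float, angular_speed: float) -> float:
    """Linear speed Vt = R * omega (radius in metres, omega in rad/s)."""
    return radius * angular_speed


def substantially_equal(a: float, b: float, rel_tol: float = 0.05) -> bool:
    """Hypothetical test for 'substantially the same' speeds (5% tolerance)."""
    return math.isclose(a, b, rel_tol=rel_tol)


# Hypothetical values: target on a 20 m arc at 0.25 rad/s, device at 5 m/s.
vt = linear_speed_on_arc(radius=20.0, angular_speed=0.25)
vi = 5.0
print(vt)                           # 5.0
print(substantially_equal(vi, vt))  # True
```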

In some embodiments, the imaging device may move in a linear direction,the target objects may move in a curvilinear direction, and the imagingdevice and the target objects may move at different speeds. For example,referring to FIG. 24 (Part B), the imaging device may move at a speed Vialong the positive x-axis direction and the target objects may move at aspeed Vt in a curvilinear direction. Vi may be greater than Vt (Vi>Vt).Accordingly, the background feature may translate at speed Vb′ in thenegative x-axis direction between the first and second image frames, andthe target feature may translate at speed Vt′ in a curvilinear directionbetween the first and second image frames, where Vt′<Vb′. Based on themovement characteristics of the feature points in FIG. 24 (Part B), theimage analyzer can determine that the imaging device is moving in alinear direction, that the target objects are moving in a curvilineardirection, and that the target objects are moving slower than theimaging device.

In the example shown in FIG. 24 (Part C), the imaging device may move ata speed Vi along the positive x-axis direction and the target objectsmay move at a speed Vt in a curvilinear direction. Vi may be less thanVt (Vi<Vt). Accordingly, the background feature may translate at speedVb′ in the negative x-axis direction between the first and second imageframes, and the target feature may translate at speed Vt′ in acurvilinear direction between the first and second image frames, whereVt′>Vb′. Based on the movement characteristics of the feature points inFIG. 24 (Part C), the image analyzer can determine that the imagingdevice is moving in a linear direction, that the target objects aremoving in a curvilinear direction, and that the target objects aremoving faster than the imaging device.

In some embodiments, both the imaging device and the target objects may be moving in a curvilinear direction, as shown in the embodiment of FIG. 25.

For example, referring to FIG. 25 (Part A), the imaging device may move at a speed Vi in a curvilinear direction and the target objects may move at a speed Vt in the same curvilinear direction. Vi and Vt may be substantially the same (Vi≅Vt). Accordingly, the background feature may move at speed Vb′ in a curvilinear direction between the first and second image frames, and the target feature may move at speed Vt′ in a curvilinear direction between the first and second image frames, where Vt′≅Vb′. Based on the movement characteristics of the feature points in FIG. 25 (Part A), the image analyzer can determine that both the imaging device and the target objects are moving in a curvilinear direction, and that the target objects and the imaging device are moving at substantially the same speed.

In some embodiments, both the imaging device and the target objects maybe moving in a curvilinear direction but at different speeds. Forexample, referring to FIG. 25 (Part B), the imaging device may move at aspeed Vi in a curvilinear direction and the target objects may move at aspeed Vt in a curvilinear direction. Vi may be greater than Vt (Vi>Vt).Accordingly, the background feature may translate at speed Vb′ in acurvilinear direction between the first and second image frames, and thetarget feature may translate at speed Vt′ in a curvilinear directionbetween the first and second image frames, where Vt′<Vb′. Based on themovement characteristics of the feature points in FIG. 25 (Part B), theimage analyzer can determine that both the imaging device and the targetobjects are moving in a curvilinear direction, and that the targetobjects are moving slower than the imaging device.

In the example shown in FIG. 25 (Part C), the imaging device may move at a speed Vi in a curvilinear direction and the target objects may move at a speed Vt in a curvilinear direction. Vi may be less than Vt (Vi<Vt). Accordingly, the background feature may move at speed Vb′ in a curvilinear direction between the first and second image frames, and the target feature may move at speed Vt′ in a curvilinear direction between the first and second image frames, where Vt′>Vb′. Based on the movement characteristics of the feature points in FIG. 25 (Part C), the image analyzer can determine that both the imaging device and the target objects are moving in a curvilinear direction, and that the target objects are moving faster than the imaging device.

In the embodiments of FIGS. 20, 21, and 22, a first movement characteristic of a first set of pixels (associated with the background feature) may comprise a first linear velocity comprising a first linear direction and a first linear speed. A second movement characteristic of a second set of pixels (associated with the target feature) may comprise a second linear velocity comprising a second linear direction and a second linear speed. In some embodiments, the image analyzer may be configured to determine that the target object is moving at a substantially same speed and direction as the imaging device, when the first linear direction is parallel to the second linear direction in opposite directions and when the first linear speed is the same as the second linear speed (see, e.g., FIG. 20 (Part A)).

In some embodiments, the image analyzer may be configured to determinethat the target object is moving in a substantially same direction asthe imaging device and at a different speed from the imaging device,when the first linear direction is parallel to the second lineardirection in opposite directions and when the first linear speed isdifferent from the second linear speed (see, e.g., FIG. 20 (Parts B andC)). In those embodiments, the image analyzer may be configured todetermine that the target object is moving faster than the imagingdevice when the first linear speed is less than the second linear speed(see, e.g., FIG. 20 (Part C)), or that the target object is movingslower than the imaging device when the first linear speed is greaterthan the second linear speed (see, e.g., FIG. 20 (Part B)).

In some embodiments, the image analyzer may be configured to determinethat the target object is stationary or at rest, when the first lineardirection is parallel to the second linear direction in a same directionand when the first linear speed is the same as the second linear speed(see, e.g., FIG. 21 (Part A)).

In some embodiments, the image analyzer may be configured to determine that the target object and the imaging device are moving in opposite directions at different speeds, when the first linear direction is parallel to the second linear direction in a same direction and when the first linear speed is different from the second linear speed (see, e.g., FIG. 21 (Parts B and C)). In those embodiments, the image analyzer may be configured to determine that the target object is moving faster than the imaging device when the first linear speed is less than the second linear speed (see, e.g., FIG. 21 (Part C)), or that the target object is moving slower than the imaging device when the first linear speed is greater than the second linear speed (see, e.g., FIG. 21 (Part B)).

In some other embodiments, the image analyzer may be configured todetermine that the target object is moving in a different direction fromthe imaging device and at a substantially same speed as the imagingdevice, when the first linear direction is different from the secondlinear direction and when the first linear speed is substantially thesame as the second linear speed (see, e.g., FIG. 22 (Part A)). In thoseembodiments, the image analyzer may be capable of determining whetherthe first linear direction is oblique to the second linear direction.

In some further embodiments, the image analyzer may be configured todetermine that the target object is moving in a different direction fromthe imaging device and at a different speed from the imaging device,when the first linear direction is different from the second lineardirection and when the first linear speed is different from the secondlinear speed (see, e.g., FIG. 22 (Parts B and C)). In those embodiments,the image analyzer may be capable of determining whether the firstlinear direction is oblique to the second linear direction. The imageanalyzer may be further configured to determine that the target objectis moving faster than the imaging device when the first linear speed isless than the second linear speed (see, e.g., FIG. 22 (Part C)), or thatthe target object is moving slower than the imaging device when thefirst linear speed is greater than the second linear speed (see, e.g.,FIG. 22 (Part B)).
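The linear decision rules described above for FIGS. 20, 21, and 22 can be summarized in a short sketch. This is one possible reading of the rules rather than the claimed implementation: the function name, the returned labels, and both tolerances (used to decide what counts as "substantially" the same speed or direction) are hypothetical.

```python
import math


def classify_linear_motion(vb, vt, speed_tol=0.05, angle_tol_deg=5.0):
    """Classify target motion relative to the imaging device.

    vb, vt: (dx, dy) pixel velocities of the background feature (first set
    of pixels) and the target feature (second set of pixels).
    Returns (heading, speed) labels. Tolerances are assumptions.
    """
    sb, st = math.hypot(*vb), math.hypot(*vt)
    if sb == 0 or st == 0:
        raise ValueError("both features must be moving in the image")

    # Angle between the first and second linear directions.
    cos_angle = (vb[0] * vt[0] + vb[1] * vt[1]) / (sb * st)
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

    # Compare the first and second linear speeds.
    if abs(st - sb) <= speed_tol * max(sb, st):
        speed = "same"
    elif st > sb:
        speed = "target_faster"
    else:
        speed = "target_slower"

    if angle >= 180.0 - angle_tol_deg:      # parallel, opposite directions
        heading = "same_direction_as_device"
    elif angle <= angle_tol_deg:            # parallel, same direction
        heading = "stationary" if speed == "same" else "opposite_to_device"
    else:
        heading = "oblique_to_device"
    return heading, speed


print(classify_linear_motion((-10, 0), (10, 0)))   # FIG. 20 (Part A) style case
print(classify_linear_motion((-10, 0), (-10, 0)))  # FIG. 21 (Part A) style case
print(classify_linear_motion((-10, 0), (6, 6)))    # FIG. 22 (Part B) style case
```

When the heading label is "stationary", the speed label "same" only records that the two pixel speeds matched; it does not describe the speed of the target object itself.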

In some embodiments, the first movement characteristic of the first setof pixels (associated with the background feature) may further comprisea first curvilinear velocity comprising a first curvilinear directionand a first curvilinear speed. The second movement characteristic of thesecond set of pixels (associated with the target feature) may comprise asecond curvilinear velocity comprising a second curvilinear directionand a second curvilinear speed. In some embodiments, the image analyzermay be configured to determine that the target object and the imagingdevice are moving in the same curvilinear direction and at the samecurvilinear speed (see, e.g., FIG. 25 (Part A)).

In some embodiments, the image analyzer may be configured to determinethat the target object and the imaging device are moving in the samecurvilinear direction and at different curvilinear speeds (see, e.g.,FIG. 25 (Parts B and C)). In those embodiments, the image analyzer maybe configured to determine that the target object is moving faster thanthe imaging device when the first curvilinear speed is less than thesecond curvilinear speed (see, e.g., FIG. 25 (Part C)), or that thetarget object is moving slower than the imaging device when the firstcurvilinear speed is greater than the second curvilinear speed (see,e.g., FIG. 25 (Part B)).

In some embodiments, the imaging device may move in a linear directionand the target object may move in a curvilinear direction (see, e.g.,FIG. 24). In some other embodiments, the imaging device may move in acurvilinear direction and the target object may move in a lineardirection. In some further embodiments, the imaging device and thetarget object may move in both linear and/or curvilinear directions atdifferent times. Any motion of the imaging device and the target object(linear, non-linear, curvilinear, zig-zag, random patterns, etc.) may becontemplated.

In some embodiments, the acceleration of each pixel further comprises alinear acceleration and/or an angular acceleration of each pixel. Forexample, the first movement characteristic of the first set of pixels(associated with the background feature) may comprise a first linearacceleration and/or a first angular acceleration. The second movementcharacteristic of the second set of pixels (associated with the targetfeature) may comprise a second linear acceleration and/or a secondangular acceleration.

The image analyzer may be configured to determine that the target objectis accelerating relative to the background object and the imaging devicewhen the first linear acceleration is different from the second linearacceleration. For example, the image analyzer can determine that thetarget object is accelerating faster than the imaging device when thefirst linear acceleration is less than the second linear acceleration,or that the target object is accelerating slower than the imaging devicewhen the first linear acceleration is greater than the second linearacceleration.

Likewise, the image analyzer may be configured to determine that thetarget object is accelerating relative to the background object and theimaging device when the first angular acceleration is different from thesecond angular acceleration. For example, the image analyzer candetermine that the target object is accelerating faster than the imagingdevice when the first angular acceleration is less than the secondangular acceleration, or that the target object is accelerating slowerthan the imaging device when the first angular acceleration is greaterthan the second angular acceleration.
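The acceleration comparison can be sketched in the same way. The example below assumes that accelerations are estimated by finite differences of per-frame speeds at a fixed frame interval, which the description does not specify; the function names, labels, and tolerance are hypothetical.

```python
def finite_difference_acceleration(speeds, dt):
    """Estimate acceleration between consecutive frames (assumes fixed dt)."""
    if len(speeds) < 2:
        raise ValueError("need at least two speed samples")
    if dt <= 0:
        raise ValueError("frame interval must be positive")
    return [(later - earlier) / dt for earlier, later in zip(speeds, speeds[1:])]


def compare_acceleration(first_acc, second_acc, tol=1e-6):
    """Compare background (first) and target (second) accelerations."""
    if abs(second_acc - first_acc) <= tol:
        return "no_relative_acceleration"
    if second_acc > first_acc:
        return "target_accelerating_faster"
    return "target_accelerating_slower"


# Hypothetical pixel speeds (px/s) sampled every 0.5 s.
background = finite_difference_acceleration([10.0, 12.0, 14.0], dt=0.5)
target = finite_difference_acceleration([10.0, 13.0, 16.0], dt=0.5)
print(background, target)                             # [4.0, 4.0] [6.0, 6.0]
print(compare_acceleration(background[0], target[0]))  # target_accelerating_faster
```

The same comparison applies to angular accelerations if the inputs are angular speeds rather than linear speeds.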

FIG. 26 illustrates a visual tracking system 2600 for tracking a groupof feature points by adjusting motion characteristics of a mobile visualtracking device, in accordance with some embodiments. In the embodimentof FIG. 26, the system may include a feedback loop for analyzed signals2622 that are output from image analyzer 2620. The analyzed signals maybe provided back to a motion controller 2640 comprising a motion sensingmodule 2630. In some embodiments, the motion controller and the motionsensing module may be provided on different components or devices. Themotion controller may be configured to track a group of feature pointsby adjusting motion characteristics of a mobile visual tracking device.The imaging device may be mounted or supported on the mobile visualtracking device. The mobile visual tracking device may be a UAV. Themotion sensing module may be configured to sense motion of the imagingdevice and/or the mobile visual tracking device, and provide motionsignals 2632 to the image analyzer. The motion signals may includemotion characteristics of the imaging device and/or the mobile visualtracking device.

The image analyzer may be configured to obtain movement characteristicsof a plurality of feature points, based on image signals 2612 providedby the imaging device and the motion signals provided by the motionsensing module. The image analyzer may be further configured to select agroup of feature points from the plurality of feature points based onthe movement characteristics of the plurality of feature points.Movement information associated with the group of feature points may beprovided back to the motion controller via the analyzed signals. Themotion controller may be configured to track the group of feature pointsby adjusting motion characteristics of the mobile visual trackingdevice/imaging device, so as to substantially position the group offeature points in a target region of each image frame captured using theimaging device.
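One simple way to select a group of feature points from their movement characteristics is to treat the dominant image motion as the background and keep the points that deviate from it. The sketch below makes that assumption explicit: it takes the per-axis median flow as the background motion, which only holds when background points outnumber target points. The function name and the pixel threshold are hypothetical, and the source does not prescribe this selection rule.

```python
import math
from statistics import median


def select_moving_group(flows, threshold_px=2.0):
    """Return indices of feature points whose flow deviates from the background.

    flows: list of (dx, dy) displacements per feature point between two frames.
    Assumption: most points belong to the background, so the per-axis median
    approximates the background motion induced by the imaging device.
    """
    if not flows:
        return []
    bg_dx = median(dx for dx, _ in flows)
    bg_dy = median(dy for _, dy in flows)
    return [
        i for i, (dx, dy) in enumerate(flows)
        if math.hypot(dx - bg_dx, dy - bg_dy) > threshold_px
    ]


# Hypothetical flows: four background points drifting left, two target points.
flows = [(-10, 0), (-10, 0.5), (-9.5, 0), (3, 1), (-10, 0), (2.5, 1)]
print(select_moving_group(flows))  # [3, 5]
```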

In the example of FIG. 26, the image analyzer may be configured to support visual tracking of one or more target objects. For example, the image analyzer may be configured to receive the plurality of image signals from the imaging device. The image signals may be indicative of a plurality of image frames (e.g., a first image frame 2612-1 and a second image frame 2612-2) captured by the imaging device over a period of time (e.g., at times T1 and T2, respectively) while the mobile visual tracking device/imaging device is in motion. Each image frame may comprise a plurality of pixels. The image analyzer may be further configured to obtain the motion characteristics of the mobile visual tracking device based on the plurality of motion signals, and to analyze the plurality of image signals based on the motion characteristics of the mobile visual tracking device, so as to compute movement characteristics associated with the plurality of pixels. The selective tracking of a group of feature points can be implemented using an optical flow algorithm, and will be described in further detail with reference to FIG. 27. Specifically, FIG. 27 illustrates the tracking of a group of feature points in exemplary images using the mobile visual tracking system of FIG. 26, in accordance with some embodiments.

Referring to FIG. 27, an image analyzer (e.g., image analyzer 2620 ofFIG. 26) may receive a plurality of image signals from an imaging device2710. The imaging device may be mounted on a UAV. The image signals maycomprise a first image frame 2712-1 captured at time T1 at location 1and a second image frame 2712-2 captured at time T2 at location 1,whereby time T2 may be a point in time occurring after time T1. At timeT1, a selected group of feature points (e.g., tracking featureassociated with a plurality of target objects) may be positioned withina target region (dotted rectangular box) of the first image frame. Attime T2, the selected group of feature points may have moved outside ofthe target region of the second image frame. In the example of FIG. 27,the target region may be a central region of each image frame. In otherembodiments, the target region may be an edge region of each imageframe. In some embodiments, a size of the target feature in the imageframes can be adjusted by causing the imaging device to zoom in closerto the target objects, or to zoom further away from the target objects.In some embodiments, each image frame may comprise a plurality of targetregions located at different locations or overlapping with one another.

Although FIG. 27 depicts three image frames, any number of image frames may be contemplated. For example, in some embodiments, the image signals may comprise a plurality of image frames 2712-1 to 2712-n captured over a period of time starting from T1 to Tn at respective locations 1 to m, where m and n may be any integer greater than 1.

In some embodiments, more image frames may be captured when the target object and/or the imaging device are moving quickly, and fewer image frames may be captured when the target object and/or the imaging device are moving slowly.
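A capture rate that rises with the speed of motion could be chosen, for instance, by limiting how far a feature may shift between consecutive frames. The sketch below is only one such policy; the function name, the rate limits, and the maximum shift per frame are hypothetical and do not come from the description.

```python
def capture_rate_hz(pixel_speed, min_hz=5.0, max_hz=60.0, max_shift_px=8.0):
    """Pick a frame rate so a feature moves at most max_shift_px per frame.

    pixel_speed: observed feature speed in pixels per second.
    The result is clamped to the hypothetical range [min_hz, max_hz].
    """
    if pixel_speed < 0:
        raise ValueError("pixel speed must be non-negative")
    rate = pixel_speed / max_shift_px
    return max(min_hz, min(max_hz, rate))


print(capture_rate_hz(16.0))    # 5.0  (slow motion, minimum rate)
print(capture_rate_hz(240.0))   # 30.0
print(capture_rate_hz(1000.0))  # 60.0 (fast motion, maximum rate)
```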

Each image frame may comprise a plurality of pixels that are associated with a plurality of feature points. As shown in FIG. 27, the feature points may be associated with target objects (e.g., a group of people) and background objects (e.g., buildings, trees, golf course, gas station, etc.). In the example of FIG. 27, the target objects may be located at a first position at time T1 (see first image frame 2712-1) and moved to a second position at time T2 (see second image frame 2712-2).

In the example of FIG. 27, movement information associated with the group of feature points may be provided back to the motion controller via the analyzed signals. The motion controller may be configured to track the group of feature points by adjusting motion characteristics of the mobile visual tracking device (e.g., by moving the tracking device from location 1 to location 2), so as to substantially position the group of feature points in each target region. Accordingly, the group of feature points may be substantially positioned in the target region of a third image frame 2712-3 captured at time T3 at location 2.

The motion characteristics of the mobile visual tracking device may be adjusted such that the motion characteristics of the mobile visual tracking device are substantially the same as the movement characteristics of the group of feature points. The movement characteristics of the group of feature points may comprise at least a velocity and/or an acceleration of the group of feature points. The velocity of the mobile visual tracking device may be associated with the velocity of the group of feature points. Likewise, the acceleration of the mobile visual tracking device may be associated with the acceleration of the group of feature points. Accordingly, the motion controller can adjust the velocity and/or acceleration of the mobile visual tracking device to track the group of feature points, so as to substantially position the group of feature points in each target region.
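A minimal sketch of such an adjustment is given below, assuming a downward-looking imaging device whose image axes are aligned with the ground axes, a known conversion from pixels to meters, and a simple proportional correction. The gain `kp`, the function name, and the parameters are hypothetical; other control laws may equally be used.

```python
import numpy as np

def velocity_command(group_vel_m_s, group_centroid_px, region_center_px,
                     meters_per_pixel, kp=0.5):
    """Match the group velocity and add a correction that re-centers the group."""
    offset_m = (np.asarray(group_centroid_px, float)
                - np.asarray(region_center_px, float)) * meters_per_pixel
    # Feed-forward term (group velocity) plus proportional term (position offset).
    return np.asarray(group_vel_m_s, float) + kp * offset_m

# Group moving at 1 m/s along +x, 20 pixels to the right of the region center.
cmd = velocity_command([1.0, 0.0], [340.0, 240.0], [320.0, 240.0],
                       meters_per_pixel=0.05)

# Once the group is centered, the command reduces to the group velocity itself.
cmd_centered = velocity_command([1.0, 0.0], [320.0, 240.0], [320.0, 240.0],
                                meters_per_pixel=0.05)
```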

In some embodiments, when the mobile visual tracking device is carried by a movable apparatus such as a UAV, a movement characteristic of the UAV may be adjusted so as to allow the mobile visual tracking device to track the group of feature points. In some embodiments, the mobile visual tracking device may comprise an imaging device. In some embodiments, the motion controller may be configured to adjust the movement of the imaging device relative to the movement of the UAV to track the group of feature points. In some embodiments, the imaging device may be supported by a movable apparatus. The movable apparatus may be an unmanned aerial vehicle (UAV). The movable apparatus may comprise a carrier for the imaging device that permits the imaging device to move relative to a supporting structure on the movable apparatus. In some embodiments, the group of feature points may be positioned at all times in a field-of-view of the imaging device.

As previously described, the motion controller may be configured to track the group of feature points by adjusting motion characteristics of the mobile visual tracking device/imaging device, so as to substantially position the group of feature points in a target region of each image frame captured using the imaging device. The motion characteristics of the mobile visual tracking device/imaging device may be adjusted via translational movement of the device, rotational movement of the device, curvilinear motion of the device, changing orientation (e.g., attitude, pitch, roll, yaw) of the device, zoom-in or zoom-out (magnification) of the device, or any combination of the above. In some embodiments, the motion characteristics of the mobile visual tracking device/imaging device may be adjusted based on certain preferential parameters (e.g., the device staying within a predetermined distance to the target objects, or keeping a minimum distance away from the target objects).

In some embodiments, the mobile visual tracking device may be configured to track a group of feature points so long as the feature points in the group have substantially the same movement characteristic. For example, the group of feature points may be generally moving in a same direction. The mobile visual tracking device may be configured to track the group of feature points independent of a size and/or a shape of the group of feature points.

In the embodiment of FIG. 27, the mobile visual tracking device is shown tracking a group of feature points surrounded by a contour having substantially the same shape and size, as the target objects move from one location to another location. In some embodiments, the mobile visual tracking device can track a group of feature points surrounded by a contour having an amorphous shape and/or changing size, for example as shown in FIG. 28. In the embodiment of FIG. 28, the size and/or shape of the contour surrounding the group of feature points changes over time as the number of target objects changes, or when the target objects move collectively in a random manner. For example, the size and/or shape of the contour may be different as the target objects move between different locations at times T1, T2, and T3. The motion controller can adjust the motion characteristics of the mobile visual tracking device to track the constantly changing group of feature points, so as to substantially position the group of feature points in each target region.

In some embodiments, the group of feature points may comprise a plurality of subsets of feature points. The plurality of subsets of feature points may comprise a first subset and a second subset of feature points. The first and second subsets of feature points may have substantially the same movement characteristic. The mobile visual tracking device may be configured to track the first and second subsets of feature points having substantially the same movement characteristic, as illustrated in FIG. 29 (Part A).

In some alternative embodiments, the first and second subsets of feature points may have substantially different movement characteristics. In those embodiments, the mobile visual tracking device may be configured to track at least one of the first or the second subsets of feature points. For example, in some instances, the mobile visual tracking device may be configured to track the first subset of feature points when a size of the first subset of feature points is greater than a size of the second subset of feature points, as illustrated in FIG. 29 (Part B). In other instances, the mobile visual tracking device may be configured to track the first subset of feature points when a size of the first subset of feature points is smaller than a size of the second subset of feature points. The mobile visual tracking device may track any particular subset of feature points depending on various characteristics associated with that subset of feature points. Exemplary characteristics may include size (as described above), shape, movement characteristics, etc. The movement characteristics may include speed, acceleration, or orientation of the feature points. In some embodiments, the subset of feature points may be tracked based on multi-factor weighting (e.g., based on a plurality of different factors relating to size, shape, speed, orientation, etc.). In some embodiments, the tracking device may be configured to track the feature points for as long as possible (for example, by zooming out to increase the field-of-view if the feature points begin to diverge), and to select one or more of the subsets of feature points if all of the feature points cannot be substantially tracked with sufficient clarity/detail. In some embodiments, when the feature points start to diverge, the imaging device may move to a higher vertical location relative to the target objects, or a further lateral distance away from the target objects, so that the tracking feature can be positioned in the field-of-view of the imaging device or in a target region of the image frames captured by the imaging device.
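The multi-factor weighting mentioned above may, for example, be sketched as a weighted sum over per-subset factors. The factor names, the weight values, and the use of a linear score are assumptions made for illustration; the disclosure does not specify a scoring formula.

```python
def select_subset(subsets, weights):
    """Return the subset with the highest weighted score.

    `subsets` is a list of dicts holding illustrative factors (e.g. size, speed);
    `weights` maps a factor name to its weight. A negative weight favors
    subsets with a smaller value of that factor.
    """
    def score(subset):
        return sum(w * subset[name] for name, w in weights.items())
    return max(subsets, key=score)

subsets = [
    {"name": "first", "size": 12, "speed": 1.0},
    {"name": "second", "size": 4, "speed": 3.0},
]

# Favor the larger subset, with some credit for speed.
larger = select_subset(subsets, {"size": 1.0, "speed": 0.5})

# Favor the smaller subset instead.
smaller = select_subset(subsets, {"size": -1.0, "speed": 0.0})
```

Changing only the weights switches between the "larger subset" and "smaller subset" behaviors described above without altering the selection logic.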

In some embodiments, sensors and/or processors may be coupled with movable objects. A movable object may be an unmanned movable object, such as an unmanned aerial vehicle. In some embodiments, the sensors may comprise imaging devices such as cameras. One or more imaging devices may be carried by a UAV. Any description herein of UAVs may apply to any other type of movable objects as desired. In some embodiments, the processor may be an embedded processor carried by the UAV. Alternatively, the processor may be separated from the UAV (e.g., at a ground station, communicating with the UAV or a movable remote controller communicating with the UAV). The UAV may utilize the imaging devices as described herein to carry out operations (e.g., in the context of visual tracking). For example, the processors on the UAV may analyze the images captured by the imaging devices and use them to identify and/or track target objects. The UAV may utilize computer vision to self-navigate within an environment. Self-navigation may include determining a local or global location of the UAV, orientation of the UAV, detection and avoidance of obstacles, and the like. Imaging devices of the present disclosure can be situated on any suitable portion of the UAV, such as above, underneath, on the side(s) of, or within a vehicle body of the UAV. Some imaging devices can be mechanically coupled to the UAV such that the spatial disposition and/or motion of the UAV correspond to the spatial disposition and/or motion of the imaging device. The imaging devices can be coupled to the UAV via a rigid coupling, such that the imaging device does not move relative to the portion of the UAV to which it is attached. Alternatively, the coupling between the imaging device and the UAV can permit movement (e.g., translational or rotational movement relative to the UAV) of the imaging device relative to the UAV. The coupling can be a permanent coupling or non-permanent (e.g., releasable) coupling. Suitable coupling methods can include adhesives, bonding, welding, and/or fasteners (e.g., screws, nails, pins, etc.). Optionally, the imaging device can be integrally formed with a portion of the UAV. Furthermore, the imaging device can be electrically coupled with a portion of the UAV (e.g., processing unit, control system, data storage) so as to enable the data collected by the imaging device to be used for various functions of the UAV (e.g., navigation, control, propulsion, communication with a user or other device, etc.), such as the embodiments discussed herein. The imaging device may be operably coupled with a portion of the UAV (e.g., processing unit, control system, data storage). One or more imaging devices may be situated on the UAV. For example, 1, 2, 3, 4, 5 or more imaging devices may be situated on the UAV. The one or more imaging devices may have the same field-of-view (FOV) or a different FOV. Each of the one or more imaging devices may be coupled to one or more processors. Each of the one or more imaging devices may individually or collectively perform the methods mentioned herein. The one or more imaging devices may capture images each with a desired texture quality. Each imaging device may capture images that are utilized for the same or different function (e.g., visual tracking application). For example, a UAV may be coupled with two imaging devices, one that tracks a group of target objects, and another that captures images that are utilized for navigation or self-positioning.

As previously described, the imaging device can be mounted on a tracking device. The tracking device may be a UAV. In some instances, the tracking device may be implemented on or provided in a UAV. Any description herein of a UAV may apply to any other type of aerial vehicle, or any other type of movable object, and vice versa. The tracking device may be capable of self-propelled motion. The description of a UAV may apply to any type of unmanned movable object (e.g., which may traverse the air, land, water, or space). The UAV may be capable of responding to commands from a remote controller. The remote controller need not be physically connected to the UAV, and may communicate with the UAV wirelessly from a distance. In some instances, the UAV may be capable of operating autonomously or semi-autonomously. The UAV may be capable of following a set of pre-programmed instructions. In some instances, the UAV may operate semi-autonomously by responding to one or more commands from a remote controller while otherwise operating autonomously. For instance, one or more commands from a remote controller may initiate a sequence of autonomous or semi-autonomous actions by the UAV in accordance with one or more parameters.

The UAV may have one or more propulsion units that may permit the UAV to move about in the air. The one or more propulsion units may enable the UAV to move about one or more, two or more, three or more, four or more, five or more, six or more degrees of freedom. In some instances, the UAV may be able to rotate about one, two, three or more axes of rotation. The axes of rotation may be orthogonal to one another. The axes of rotation may remain orthogonal to one another throughout the course of the UAV's flight. The axes of rotation may include a pitch axis, roll axis, and/or yaw axis. The UAV may be able to move along one or more dimensions. For example, the UAV may be able to move upwards due to the lift generated by one or more rotors. In some instances, the UAV may be capable of moving along a Z axis (which may be up relative to the UAV orientation), an X axis, and/or a Y axis (which may be lateral). The UAV may be capable of moving along one, two, or three axes that may be orthogonal to one another.

The UAV may be a rotorcraft. In some instances, the UAV may be a multi-rotor craft that may include a plurality of rotors. The plurality of rotors may be capable of rotating to generate lift for the UAV. The rotors may be propulsion units that may enable the UAV to move about freely through the air. The rotors may rotate at the same rate and/or may generate the same amount of lift or thrust. The rotors may optionally rotate at varying rates, which may generate different amounts of lift or thrust and/or permit the UAV to rotate. In some instances, one, two, three, four, five, six, seven, eight, nine, ten, or more rotors may be provided on a UAV. The rotors may be arranged so that their axes of rotation are parallel to one another. In some instances, the rotors may have axes of rotation that are at any angle relative to one another, which may affect the motion of the UAV.

The UAV may have a housing. The housing may include one or more internal cavities. The UAV may include a central body. The UAV may optionally have one or more arms branching from the central body. The arms may support the propulsion units. One or more branch cavities may be within the arms of the UAV. The housing may or may not include the arms that branch from the central body. In some instances, the housing may be formed from an integral piece that encompasses the central body and the arms. Alternatively, separate housings or pieces may be used to form the central body and arms.

Optionally, the tracking device may be movable by changing spatial location (e.g., translating in an X direction, Y direction, and/or Z direction). Alternatively or in combination, the tracking device may be configured to change orientation within space. For instance, the tracking device may be capable of rotating about a yaw axis, a pitch axis, and/or a roll axis. In one example, the tracking device may not substantially change spatial location, but may change angular orientation (e.g., a security camera mounted on a stationary support, such as a structure). In another example, the tracking device may not substantially change orientation but may change spatial location. In some instances, the tracking device may be capable of both changing spatial location and angular orientation.

FIG. 30 illustrates a movable object 3000 including a carrier 3002 and a payload 3004, in accordance with embodiments. Although the movable object 3000 is depicted as an aircraft, this depiction is not intended to be limiting, and any suitable type of movable object can be used, as previously described herein. One of skill in the art would appreciate that any of the embodiments described herein in the context of aircraft systems can be applied to any suitable movable object (e.g., a UAV).

In some embodiments, the movable object 3000 may be a UAV. The UAV can include a propulsion system having any number of rotors (e.g., one, two, three, four, five, six, or more). The rotors or other propulsion systems of the unmanned aerial vehicle may enable the unmanned aerial vehicle to hover/maintain position, change orientation, and/or change location. The distance between shafts of opposite rotors can be any suitable length. For example, the length can be less than or equal to 2 m, or less than or equal to 5 m. In some embodiments, the length can be within a range from 40 cm to 7 m, from 70 cm to 2 m, or from 5 cm to 5 m. Any description herein of a UAV may apply to a movable object, such as a movable object of a different type, and vice versa.

In some instances, the payload 3004 may be provided on the movable object 3000 without requiring the carrier 3002. The movable object 3000 may include propulsion mechanisms 3006, a sensing system 3008, and a communication system 3010. The propulsion mechanisms 3006 can include one or more of rotors, propellers, blades, engines, motors, wheels, axles, magnets, or nozzles, as previously described herein. The movable object may have one or more, two or more, three or more, or four or more propulsion mechanisms. The propulsion mechanisms may all be of the same type. Alternatively, one or more propulsion mechanisms can be different types of propulsion mechanisms. In some embodiments, the propulsion mechanisms 3006 can enable the movable object 3000 to take off vertically from a surface or land vertically on a surface without requiring any horizontal movement of the movable object 3000 (e.g., without traveling down a runway). Optionally, the propulsion mechanisms 3006 can be operable to permit the movable object 3000 to hover in the air at a specified position and/or orientation.

For example, the movable object 3000 can have multiple horizontally oriented rotors that can provide lift and/or thrust to the movable object. The multiple horizontally oriented rotors can be actuated to provide vertical takeoff, vertical landing, and hovering capabilities to the movable object 3000. In some embodiments, one or more of the horizontally oriented rotors may spin in a clockwise direction, while one or more of the horizontally oriented rotors may spin in a counterclockwise direction. For example, the number of clockwise rotors may be equal to the number of counterclockwise rotors. The rotation rate of each of the horizontally oriented rotors can be varied independently in order to control the lift and/or thrust produced by each rotor, and thereby adjust the spatial disposition, velocity, and/or acceleration of the movable object 3000 (e.g., with respect to up to three degrees of translation and up to three degrees of rotation).
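For illustration, independent variation of rotor rates may be sketched with a simple mixing rule for a four-rotor layout having two clockwise and two counterclockwise rotors. The rotor ordering, the sign conventions, and the unitless command values are assumptions of this sketch and do not describe any particular vehicle.

```python
def mix_quad_x(thrust, roll, pitch, yaw):
    """Map collective thrust and roll/pitch/yaw commands to four rotor commands.

    Assumed rotor order: front-left (CW), front-right (CCW),
    rear-right (CW), rear-left (CCW). Signs are illustrative only.
    """
    return (
        thrust + roll + pitch - yaw,  # front-left, clockwise
        thrust - roll + pitch + yaw,  # front-right, counterclockwise
        thrust - roll - pitch - yaw,  # rear-right, clockwise
        thrust + roll - pitch + yaw,  # rear-left, counterclockwise
    )

hover = mix_quad_x(thrust=0.5, roll=0.0, pitch=0.0, yaw=0.0)
yawing = mix_quad_x(thrust=0.5, roll=0.0, pitch=0.0, yaw=0.1)
```

In this sketch a pure yaw command speeds up the counterclockwise pair and slows the clockwise pair by the same amount, so the sum of the four rotor commands is unchanged while the net reaction torque changes.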

The sensing system 3008 can include one or more sensors that may sense the spatial disposition, velocity, and/or acceleration of the movable object 3000 (e.g., with respect to up to three degrees of translation and up to three degrees of rotation). The one or more sensors can include global positioning system (GPS) sensors, motion sensors, inertial sensors, proximity sensors, or image sensors. The sensing data provided by the sensing system 3008 can be used to control the spatial disposition, velocity, and/or orientation of the movable object 3000 (e.g., using a suitable processing unit and/or control module, as described below). Alternatively, the sensing system 3008 can be used to provide data regarding the environment surrounding the movable object, such as weather conditions, proximity to potential obstacles, location of geographical features, location of manmade structures, and the like.

The sensing system may include image sensors, imaging devices, and/or image analyzers (e.g., image analyzer 120 of FIG. 1) as described herein. The sensing system may also include a motion sensing module (e.g., motion sensing module 1530 of FIG. 15) as described herein. The sensing system may further include a motion controller (e.g., motion controller 2640 of FIG. 26) as described herein. The motion sensing module may be configured to sense motion of the imaging device and/or a mobile visual tracking device, and provide motion signals to the image analyzer. The motion signals may include motion characteristics of the imaging device and/or the mobile visual tracking device. The image analyzer may be configured to obtain movement characteristics of a plurality of feature points, based on image signals provided by the imaging device and the motion signals provided by the motion sensing module. The image analyzer may be further configured to select a group of feature points from the plurality of feature points based on the movement characteristics of the plurality of feature points.
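One way the selection of a group of feature points might be carried out is sketched below, assuming that per-point image velocities and the expected background velocity (derived from the motion signals) are already available. The thresholds, the use of a median as the reference movement, and the function name are hypothetical choices for this sketch.

```python
import numpy as np

def select_tracking_group(velocities, background_vel,
                          min_speed_diff=1.0, tol=2.0):
    """Boolean mask of feature points that move together, apart from the background."""
    v = np.asarray(velocities, float)
    residual = v - np.asarray(background_vel, float)
    # Points whose movement differs from the background are candidates.
    moving = np.linalg.norm(residual, axis=1) > min_speed_diff
    if not moving.any():
        return np.zeros(len(v), dtype=bool)
    # Keep candidates whose movement is close to the typical candidate movement.
    reference = np.median(residual[moving], axis=0)
    similar = np.linalg.norm(residual - reference, axis=1) <= tol
    return moving & similar

velocities = [[-20, 0], [-20, 0.5],          # background-like points
              [10, 0], [11, 1], [10, -1],    # points moving together
              [-20, 30]]                     # an isolated mover
mask = select_tracking_group(velocities, background_vel=[-20, 0])
```

Here the first two points are rejected as background, the next three are selected as a group with substantially the same movement characteristic, and the last point is rejected because it moves differently from the group.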

Movement information associated with the group of feature points may be provided back to the motion controller via the analyzed signals. The motion controller may be configured to track the group of feature points by adjusting motion characteristics of the mobile visual tracking device/imaging device, so as to substantially position the group of feature points in a target region of each image frame captured using the imaging device. The motion controller may be configured to track a group of feature points by adjusting motion characteristics of a mobile visual tracking device.

Accordingly, one or more of the components in the above sensing system can enable precise tracking of a moving target object and/or a group of moving target objects under different conditions. The conditions may include both indoor and outdoor environments, places without GPS signals or places that have poor GPS signal reception, a variety of different terrain, etc. The target objects may include target objects that do not carry GPS apparatus, target objects that do not have well-defined features or that do not fall into known object classes, target objects that collectively form a group whereby the size and/or shape of the group may be amorphous and change over time, a plurality of different target objects moving in different formations, or any combination of the above.

The communication system 3010 enables communication with a terminal 3012 having a communication system 3014 via wireless signals 3016. In some embodiments, the terminal may include an image analyzer, a motion sensing module, and/or a motion controller as described elsewhere herein. The communication systems 3010, 3014 may include any number of transmitters, receivers, and/or transceivers suitable for wireless communication. The communication may be one-way communication, such that data can be transmitted in only one direction. For example, one-way communication may involve only the movable object 3000 transmitting data to the terminal 3012, or vice-versa. The data may be transmitted from one or more transmitters of the communication system 3010 to one or more receivers of the communication system 3014, or vice-versa. Alternatively, the communication may be two-way communication, such that data can be transmitted in both directions between the movable object 3000 and the terminal 3012. The two-way communication can involve transmitting data from one or more transmitters of the communication system 3010 to one or more receivers of the communication system 3014, and vice-versa.

In some embodiments, the terminal 3012 can provide control data to one or more of the movable object 3000, carrier 3002, and payload 3004 and receive information from one or more of the movable object 3000, carrier 3002, and payload 3004 (e.g., position and/or motion information of the movable object, carrier or payload; data sensed by the payload such as image data captured by a payload camera). In some embodiments, the movable object 3000 can be configured to communicate with another remote device in addition to the terminal 3012, or instead of the terminal 3012. The terminal 3012 may also be configured to communicate with another remote device as well as the movable object 3000. For example, the movable object 3000 and/or terminal 3012 may communicate with another movable object, or a carrier or payload of another movable object. When desired, the remote device may be a second terminal or other computing device (e.g., computer, laptop, tablet, smartphone, or other mobile device). The remote device can be configured to transmit data to the movable object 3000, receive data from the movable object 3000, transmit data to the terminal 3012, and/or receive data from the terminal 3012. Optionally, the remote device can be connected to the Internet or other telecommunications network, such that data received from the movable object 3000 and/or terminal 3012 can be uploaded to a website or server.

While some embodiments of the present disclosure have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the disclosure. It should be understood that various alternatives to the embodiments of the disclosure described herein may be employed in practicing the disclosure. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.

What is claimed is:
 1. A method for supporting visual tracking, the method comprising: receiving a plurality of image signals indicative of a plurality of image frames captured by an imaging device over a period of time while the imaging device is in motion, wherein each image frame comprises a plurality of pixels; obtaining motion characteristics of the imaging device based on a plurality of motion signals; and analyzing the plurality of image signals based on the motion characteristics of the imaging device, so as to compute movement characteristics associated with the plurality of pixels.
 2. The method of claim 1, wherein analyzing the plurality of image signals based on the motion characteristics of the imaging device comprises: obtaining a correlation between the movement characteristics associated with the plurality of pixels and the motion characteristics of the imaging device.
 3. The method of claim 1, wherein the plurality of pixels are associated with a plurality of feature points, each feature point associated with a background feature or a tracking feature.
 4. The method of claim 3, wherein the background feature is associated with a first set of pixels having a first movement characteristic and the tracking feature is associated with a second set of pixels having a second movement characteristic that is different from the first movement characteristic.
 5. The method of claim 3, further comprising: adjusting movement of the imaging device to position a group of feature points associated with the tracking feature in a target region of a subsequent image frame captured by the imaging device.
 6. The method of claim 3, further comprising: selecting a subset of feature points associated with the tracking feature based on size, shape, movement characteristics of the feature points associated with the tracking feature; and using the subset of feature points to track the tracking feature.
 7. The method of claim 1, wherein analyzing the plurality of image signals based on the motion characteristics of the imaging device comprises: identifying a background feature of each image frame based on the motion characteristics of the imaging device; and identifying a tracking feature relative to the background feature.
 8. The method of claim 7, wherein identifying the background feature of each image based on the motion characteristics of the imaging device comprises: identifying feature points that move across the image frames at a speed that is scaled in proportion to a speed of the imaging device to identify the background feature.
 9. The method of claim 7, wherein identifying the background feature of each image based on the motion characteristics of the imaging device comprises: identifying feature points that move across the image frames at a movement direction that is opposite to a movement direction of the imaging device to identify the background feature.
 10. The method of claim 7, wherein identifying the tracking feature relative to the background feature comprises: identifying feature points that move across the image frames at a speed that is different from a speed of feature points associated with the background feature to identify the tracking feature.
 11. The method of claim 7, wherein identifying the tracking feature relative to the background feature comprises: identifying feature points that move across the image frames at a movement direction that is different from a movement direction of feature points associated with the background feature to identify the tracking feature.
 12. The method of claim 1, wherein the movement characteristics associated with the plurality of pixels comprise at least one of a velocity or an acceleration of each of the plurality of pixels as measured across the plurality of image frames.
 13. The method of claim 1, wherein the motion characteristics of the imaging device comprise at least one of an attitude, an instantaneous position, a velocity, or an acceleration of the imaging device.
 14. The method of claim 13, further comprising: determining the instantaneous position of the imaging device using a range-finding and/or locating device.
 15. The method of claim 13, wherein the instantaneous position of the imaging device is determined relative to a physical location of a background feature.
 16. The method of claim 15, further comprising: calculating a scaling factor based on the instantaneous position of the imaging device and the physical location of the background feature; and computing the movement characteristic of each of the plurality of pixels using the motion characteristics of the imaging device and the scaling factor.
 17. The method of claim 1, wherein the motion characteristics of the imaging device are determined using an inertial measurement sensor.
 18. An apparatus for supporting visual tracking, the apparatus comprising one or more processors that are, individually or collectively, configured to: receive a plurality of image signals indicative of a plurality of image frames captured by an imaging device over a period of time while the imaging device is in motion, wherein each image frame comprises a plurality of pixels; obtain motion characteristics of the imaging device based on a plurality of motion signals; and analyze the plurality of image signals based on the motion characteristics of the imaging device, so as to compute movement characteristics associated with the plurality of pixels.
 19. The apparatus of claim 18, wherein the apparatus is an unmanned aerial vehicle (UAV).
 20. A non-transitory computer-readable medium storing instructions that, when executed, cause a computer to perform a method for supporting visual tracking, the method comprising: receiving a plurality of image signals indicative of a plurality of image frames captured by an imaging device over a period of time while the imaging device is in motion, wherein each image frame comprises a plurality of pixels; obtaining motion characteristics of the imaging device based on a plurality of motion signals; and analyzing the plurality of image signals based on the motion characteristics of the imaging device, so as to compute movement characteristics associated with the plurality of pixels.